ABSTRACT


The first part of this thesis aims to identify and analyze what aspects of the MIL- 
HDBK-217 prediction model are causing the large variation between prediction and field 
reliability. The key findings of the literature research suggest that the main reason for the 
inaccuracy in prediction is that the constant failure rate assumption used in MIL-HDBK-217
is usually not applicable. Secondly, even if the constant failure rate
assumption is applicable, the disparity may still exist in the presence of design and 
quality related problems in new systems. A possible solution is to apply reliability growth 
testing (RGT) to new systems during the development phase in an attempt to remove 
these design deficiencies so that the system’s reliability will grow and approach the 
predicted value. In view of the importance of RGT in minimizing the disparity, this thesis 
provides a detailed application of the AMSAA Extended Reliability Growth Models to 
the reliability growth analysis of a combat system. It shows how program managers can 
analyze test data using commercial software to estimate the system demonstrated 
reliability and the increase in reliability due to delayed fixes.
EXECUTIVE SUMMARY


One of the major problems in today’s military systems is the issue of poor 
reliability, and the inconsistency between predicted and field reliability. Experience has 
shown that two reasons are: 1) the inability to consistently predict field reliability using 
reliability prediction models, and 2) inadequate emphasis on reliability testing prior to 
system fielding, as more emphasis is being placed on meeting performance requirements
than reliability requirements. The first part of this thesis aims to identify and analyze 
what principal aspects of the MIL-HDBK-217 prediction model are causing the large 
variation between prediction and field reliability with the ultimate goal of minimizing the 
gap. The second part of the thesis demonstrates how the Duane reliability growth model
can be used as a tool for reliability growth planning, and applies the AMSAA Extended
Reliability Growth Models to the analysis of reliability growth.

The key findings of the literature research suggest that the main issues are some 
of the inherent assumptions of the MIL-HDBK-217 prediction model. First, the constant 
failure rate assumption that has been generally applied in reliability prediction is usually 
not applicable. However, Drenick’s theorem has proven that complex repairable systems, 
under certain constraints, can be well represented by the exponential distribution. The 
reliability engineer must be able to recognize when the mathematical simplicity of the 
constant failure rate model can be used without a substantial penalty in prediction 
accuracy. Secondly, accurate failure rate data are lacking because acquiring field
data on components is very time consuming. A well designed
part is less likely to fail early, leading to extended waiting time for any useful 
information. A possible solution is to apply accelerated life testing to components to 
shorten the waiting time required for acquiring failure rates data. Lastly, even if the 
exponential distribution is applicable, the disparity between predicted and field reliability 
may still exist in new systems because of unexpected failure modes that may arise in the 
presence of design and quality deficiencies which will prevent the system from reaching 
the predicted value. A possible solution is to apply reliability growth testing (RGT) to 
new systems during the development phase in an attempt to remove these design
deficiencies so that the system’s reliability will grow and approach the predicted value. In 
contrast to the MIL-HDBK-217 prediction model, AMSAA reliability growth models 
assume that system failures during development follow a Non-Homogeneous Poisson 
Process (NHPP). 

In view of the importance of RGT in minimizing the disparity, this thesis provides 
a detailed application of the AMSAA Extended Reliability Growth Models to the 
reliability growth analysis of a combat system. It shows how program managers can 
analyze test data using commercial software to estimate the system’s demonstrated 
reliability and the increase in reliability due to delayed fixes. The example combat
system consists of two main subsystems. The reliability growth for both the subsystems is 
tracked over three phases of testing. Reliability is tracked on a phase by phase basis using 
test data collected within each test phase. The reliability growth model selected for each
phase is based on the management approach employed in that phase. The three
types of AMSAA reliability growth models are: 1) AMSAA Extended Model for Test- 
Fix-Test, 2) AMSAA Extended Test-Find-Test Projection Model, and 3) AMSAA 
Extended Model for Test-Fix-Find-Test. 

The results of the reliability analysis for the combat system show that the 
demonstrated system reliability for both subsystems is initially low but improves as 
testing progresses. Reliability is finally estimated to meet the predicted value as failure
modes are discovered and eliminated through the Test-Analyze-And-Fix (TAAF) process.
I conclude that the
application of RGT during the developmental phase is effective in minimizing the 
disparity between predicted and field reliability. Systems that bypass development testing 
will experience low reliability in the field, which is one of the main causes of disparity 
between predicted and field reliability. 

There are also some important lessons learned on the use of the reliability growth 
models from this thesis. For the Duane model, the total test time required for an RGT
program is sensitive to the system’s initial reliability, initial test time, and growth rate. In
most practical cases, the total test time is fixed by the time and resources
available in the development program.

The use of failure mode designation in AMSAA Extended Reliability Growth 
Models has proven to be beneficial as it can provide many useful metrics in reliability 
growth analysis and for decision making during the test program. They are: 1) initial 
system reliability at the beginning of a test phase, 2) the average effectiveness factor (EF) 
of remedying failure modes, 3) fraction of seen and unseen Type BD failure modes, and 
4) system failure rate breakdown for individual failure modes. Knowing the failure rate 
breakdown of individual failure modes in the system is important as it enables easy 
identification of failure modes with relatively high failure rate. It is also important to note 
that the final system reliability is sensitive to the assigned value of EF for Type BD 
failure modes. To prevent overestimation of the system’s final reliability, a conservative
EF should be assigned since the actual effectiveness of the delayed fixes cannot be 
determined without further testing. 

For new systems under development, the use of the AMSAA NHPP model 
provides a better representation of the system’s failure rate than the exponential 
distribution because the failure rate is varying with time as testing progresses. Once the 
system matures through a period of testing and reliability growth has reached a plateau, 
the system’s failure rate will tend towards being well represented by an exponential 
distribution. 


I. INTRODUCTION


A. BACKGROUND 


Reliable weapon systems are critical elements for fighting and winning wars, and 
reliability is an effective force multiplier that contributes towards higher operational 
readiness and a reduced logistics footprint. One of the major problems in today’s military 
systems is the issue of poor reliability, and the inconsistency between predicted and field 
reliability. Experience has shown that two reasons for this inconsistency are: 1) the 
inability to consistently predict field reliability using reliability prediction models, and 2) 
inadequate emphasis on reliability testing prior to system fielding as more emphasis is 
being placed on meeting performance requirements than reliability requirements [Ref. 1 
and Ref. 2]. 

This chapter first introduces the issues concerning the inability to predict field
reliability and the importance of reliability testing for military systems, and then
introduces the concept of reliability growth. The scope and objectives of this research
are presented last, along with the potential benefits.

Within the military, there is a need in the early stages of the development program 
to accurately predict the expected field reliability of military systems for logistics and 
operational planning purposes. These include the determination of spares quantity, 
forecast of maintenance support cost, life cycle cost, and systems availability analysis. 
These analyses require accurate reliability predictions. Research has shown, however, 
that the field reliability of weapon systems has often failed to measure up to its predicted 
Mean-Time-Between-Failure (MTBF) [Ref. 1].

Empirically it has been found that the ratio of the predicted MTBF to its field 
MTBF for military systems can vary by as much as 20:1 [Ref. 1]. Table 1 presents some 
examples of this disparity. 
Equipment             Reliability Ratio (Predicted : Field)

Airborne Avionics     >20:1
Airborne Radar        5.0:1
Airborne Fighter      9.1:1
Airborne Transport    2.3:1

Table 1. Ratio disparity between predicted and field MTBF [After Ref. 1] 


Reliability prediction is performed during the early design phase, when the
prototype is not yet built, to estimate the expected field reliability of the system. The
most widely used prediction method in the military is MIL-HDBK-217. Although
DoD has discontinued updates of MIL-HDBK-217F, this standard is still widely used in 
the military. Its effectiveness has not been clearly established since it has been shown that 
there exist large variations between predicted and field reliability. Research efforts are 
required to examine the problems of the MIL-HDBK-217 prediction model that have 
caused this disparity. 

The inability to relate predicted reliability to field reliability could have severe 
impact from both the logistics and operational perspective. A recent analysis performed 
on the Comanche helicopter by an NPS student indicates that missing the predicted 
availability by just one percent could increase the life-cycle Operation & Support (O&S) 
cost by more than $75 million [Ref. 3]. 

As important as reliability prediction is, its value starts to diminish once 
prototypes are built and the reliability can be assessed via testing. Reliability prediction 
and reliability testing play different roles but they complement one another at different 
stages of the product development cycle. Reliability testing is performed to ensure that
the fielded system meets the specified level of reliability. 

Over the years, there have been numerous reported cases of military systems exhibiting
poor reliability. One example is the Hunter Unmanned Aerial Vehicle (UAV) System
[Ref. 4]. The urgent need for the US Army to have a UAV System forced the Hunter
System to be fielded without going through its development phase, which means that the
system was not adequately tested. Consequently, several Air Vehicles (AVs) were lost
due to various failures, which finally resulted in a decision to terminate the production
program. The lesson learned is to recognize the significance of reliability testing during
the development phase. Reliability can only be validated with rigorous testing under 
actual combat conditions. This is especially important for complex and state-of-the-art 
weapon systems. There are too many uncertainties and risks involved, especially in the 
area of systems design, and it is virtually impossible for designers to predict in advance 
all possible sources of failure modes. Failure to achieve an acceptable level of reliability 
at this late stage of development can have a devastating impact on the program, including 
fielding a less reliable weapon system and incurring additional cost for retesting and 
redesign. 

Reliability testing does not guarantee that reliability targets will be met ultimately 
but having a strong emphasis on reliability testing should substantially increase the 
chances of meeting these objectives. During system development, the eventual goal for 
the system’s reliability is known as the reliability target. However, the initial prototypes 
produced will almost certainly contain design, quality, and other engineering related 
flaws that prevent a prototype from reaching the target immediately. In order to improve 
the reliability, the prototypes are subjected to intensive testing to identify and implement 
corrective actions to improve the design. This process of testing, fixing, and testing to 
increase the system’s reliability is known as reliability growth. Reliability growth is
generally quantified by an increase in mean time between failures over time. The 
intervals between failures will become longer on average if there is positive reliability 
growth. On the other hand, if negative growth is occurring, these intervals will tend to be 
shorter. For no growth, the intervals will retain the same mean. 
The estimation of system reliability involves the use of a reliability growth model.
A reliability growth model is an analytical model that represents the reliability of the 
system during the development process. It accounts for the changes in reliability due to 
all corrective actions incorporated during the developmental phase. The basic principle of 
a reliability model is to apply the failure data collected during prototype testing to 
determine the reliability of the system. A reliability model is also used for developing a 
test plan to determine the amount of test time required to meet the reliability targets. 
Once the test plan is developed, the model can be used as data is collected to estimate the 
reliability of the system during the test phase in order to know how much additional 
testing is required to meet the target. Extrapolating a growth curve beyond the current 
data estimates what reliability a program can be expected to achieve providing that the 
conditions of the test and the engineering effort to improve reliability are maintained at 
their present levels. 

Although many models exist for modeling reliability growth, the Duane and the
US Army Materiel Systems Analysis Activity (AMSAA) models are among the most
widely used in the military [Ref. 5]. The deterministic Duane model is
commonly used for constructing the idealized growth curve in reliability growth
planning. The AMSAA model employs the Weibull process to model reliability growth,
and its statistical nature allows estimation of unknown parameters using test data, which
makes it a useful tool for reliability assessment.
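
The two roles described above can be illustrated with a minimal Python sketch. It is not taken from this thesis or from the commercial software it uses. It assumes the textbook form of the Duane postulate (cumulative MTBF growing as a power of test time) and the standard maximum likelihood estimates for a power-law (Weibull process) NHPP observed over a time-terminated test. The function names and all numerical inputs are hypothetical.

```python
import math


def duane_total_test_time(mtbf_initial, t_initial, mtbf_target, alpha):
    """Test time at which the Duane instantaneous MTBF reaches mtbf_target.

    Assumed model (textbook Duane postulate, not quoted from the thesis):
      cumulative MTBF     M_c(t) = mtbf_initial * (t / t_initial) ** alpha
      instantaneous MTBF  M_i(t) = M_c(t) / (1 - alpha),  with 0 < alpha < 1
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("growth rate alpha must lie strictly between 0 and 1")
    ratio = mtbf_target * (1.0 - alpha) / mtbf_initial
    return t_initial * ratio ** (1.0 / alpha)


def amsaa_mle(failure_times, t_end):
    """Power-law NHPP estimates for a test that is stopped at time t_end.

    Assumed intensity: rho(t) = lam * beta * t ** (beta - 1).
    Returns (beta, lam, instantaneous MTBF at t_end).
    """
    n = len(failure_times)
    beta = n / sum(math.log(t_end / t) for t in failure_times)
    lam = n / t_end ** beta
    mtbf_at_end = 1.0 / (lam * beta * t_end ** (beta - 1.0))
    return beta, lam, mtbf_at_end


# Planning: hypothetical program starting at 50 h MTBF after 100 h of testing.
t_total = duane_total_test_time(50.0, 100.0, 200.0, alpha=0.4)
print(f"Planned total test time: {t_total:.0f} h")

# Assessment: hypothetical cumulative failure times (hours) in a 400 h test.
beta, lam, mtbf_now = amsaa_mle([10.0, 30.0, 70.0, 150.0, 300.0], t_end=400.0)
print(f"beta = {beta:.3f}, demonstrated MTBF = {mtbf_now:.1f} h")
```

With these illustrative inputs the planned test time is roughly 890 hours at a growth rate of 0.4 and rises to roughly 3,100 hours at a growth rate of 0.3, which shows how sensitive the plan is to the assumed growth rate. An estimated beta below 1 indicates a decreasing failure intensity, that is, positive reliability growth.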


B. OBJECTIVES AND SCOPE OF RESEARCH 


The first part of this thesis aims to identify and analyze what principal aspects of 
the MIL-HDBK-217 prediction model are causing the large variation between prediction 
and field reliability. 

The second part of the thesis aims to demonstrate the use of the Test-Analyze- 
And-Fix (TAAF) concept for the reliability growth analysis of a combat system. The 
main intent is to demonstrate how the Duane reliability growth model can be used as a 
useful tool to construct an idealized growth curve for the purpose of reliability growth
planning and also to apply the AMSAA Extended Reliability Growth Models for 
analyzing reliability growth. 

Lastly, lessons learned and recommendations on reliability growth based on this 
research will be presented in this thesis. 


C. POTENTIAL BENEFITS OF RESEARCH 


This research consolidates some important findings on the factors that have given rise to the
inaccuracy in the MIL-HDBK-217 reliability prediction with the ultimate goal of
minimizing the gap between predicted and field reliability.

This thesis also shows how program managers can plan and analyze test data using 
commercial software to estimate the system’s demonstrated reliability and estimate the 
increase in reliability due to delayed fixes.
II. ANALYSIS OF THE RELIABILITY DISPARITY


A. INTRODUCTION 


Within the military, accurate prediction of system reliability plays a critical role 
from both the logistics and operational perspective. MTBF figures are used for many 
logistics and operational planning activities [Ref. 6]. They include the following: 

Spares Provisioning. Determination of spare quantities purchased to meet
operational availability. Components with higher failure rates need to be stocked in
larger quantities.

Development of Maintenance Strategies. In many cases, MTBF is used to
determine the preventive maintenance intervals of a component.

Estimation of Life Cycle Cost. Estimation of the total system cost on a yearly
basis.

Unfortunately, there are a host of factors that give rise to the disparity between 
predicted and field MTBF. The focus here is to identify and analyze principal aspects of 
the MIL-HDBK-217 prediction model that are causing the large variation between 
prediction and field reliability. 

The remainder of this chapter first discusses the key concepts pertinent to the
understanding of the research theme, which include the “bathtub” curve, the exponential
distribution, and the principles of reliability prediction, followed by a discussion of the
results, conclusions and recommendations.

B. KEY CONCEPTS 


This section provides a fundamental understanding of the key concepts related to 
reliability prediction such as the “bathtub” curve, the exponential distribution and also the 
principles of reliability prediction in order to understand the research theme—the 
reliability disparity. 
1. The Bathtub Curve


Figure 1 shows a “bathtub” curve that is often used in the field of reliability to 
describe the failure rate behavior of a system over its life cycle. The vertical axis of the 
“bathtub” curve represents the hazard rate or the instantaneous failure rate. The hazard 
rate applies only to non-repairable systems in which only one failure can occur. For
repairable systems the term failure rate or rate of occurrence of failure is more 
appropriate. The “bathtub” curve consists of three distinct regions: infant mortality, 
useful life and wear-out [Ref. 7]. 

The infant mortality region exhibits a decreasing failure rate, characterized by 
early failures attributable to defects in design, manufacturing or construction. The failure 
rate decreases with time as the design defects are detected and repaired. The failure rate is 
the probability of failure in the next interval of time given that an item has survived to a 
certain age, divided by the length of the interval. It is an important function in reliability 
analysis since it shows changes in probability of failure over the lifetime of a system. One 
way to eliminate such failures is through design and production quality control measures 
that will reduce variability and hence infant mortality failures [Ref. 12]. 

The useful life region by assumption has a reasonably constant failure rate, 
characterized by random failures. These failures are likely caused by unavoidable load 
rather than any inherent defect in the system. There are many forms of possible external 
loadings such as temperature fluctuations, vibration, power surges and moisture variation. 
Random failures can be reduced by increasing the robustness of the design and/or 
controlling the external environment. 

The wear-out region has an increasing failure rate characterized by the aging 
phenomena. The typical failure mechanisms are corrosion, fatigue cracking, 
embrittlement, and diffusion of materials. 

In reliability prediction, the failure rate of a system has often been assumed to be 
constant which resembles the useful life region of the bathtub curve as shown in Figure 1. 
In reality, the assumption of constant failure rate is more representative of an electronic 
system rather than a mechanical system. Failure occurrences in electronic systems are
considered random and are assumed to follow a Poisson process.

On the other hand, the failure distribution of mechanical hardware is characterized 
by an initial wear-in period followed by a long span of increasing failure rate. The
primary failure mechanisms for mechanical systems are corrosion, fatigue and other 
cumulative effects. 



Figure 1. The “bathtub” curve [From Ref. 7] 

2. The Exponential Distribution 

The exponential distribution models the failure rate in the useful life region of the 
bathtub curve as it assumes that the rate at which the system fails is independent of its 
cumulative age [Ref. 8]. This assumption greatly simplifies the mathematics involved in 
reliability analysis as it is much easier to calculate with an assumed constant failure rate 
than to derive the parameters of a two-parameter distribution (e.g., Weibull). This is one
of the main reasons for its wide application in many reliability analyses. 
Further, the lack of memory property of the exponential distribution implies
that a repaired system is as good as new. For the exponential distribution, reliability
as a function of time t and failure rate \lambda is written as

R(t) = e^{-\lambda t}.    (2.1)
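
Equation (2.1) and the lack of memory property can be checked numerically with a short Python sketch. The function name and the 500-hour MTBF are illustrative assumptions and do not come from the thesis. The sketch relies on the standard result that, for a constant failure rate, the MTBF is the reciprocal of the failure rate.

```python
import math


def reliability_exponential(t_hours, failure_rate):
    """R(t) = exp(-lambda * t) for a constant failure rate (Equation 2.1)."""
    return math.exp(-failure_rate * t_hours)


# Hypothetical system with an MTBF of 500 hours, so lambda = 1 / MTBF.
mtbf_hours = 500.0
failure_rate = 1.0 / mtbf_hours

# Probability of surviving a 100 h mission, and of surviving one full MTBF.
r_mission = reliability_exponential(100.0, failure_rate)
r_at_mtbf = reliability_exponential(mtbf_hours, failure_rate)  # equals exp(-1)

# Lack of memory: surviving 100 more hours after 300 h of operation is as
# likely as surviving the first 100 h.
r_conditional = (reliability_exponential(400.0, failure_rate)
                 / reliability_exponential(300.0, failure_rate))

print(f"R(100 h) = {r_mission:.4f}")
print(f"R(MTBF)  = {r_at_mtbf:.4f}")
print(f"R(100 h more | survived 300 h) = {r_conditional:.4f}")
```

Under this model only about 37 percent of systems survive to the MTBF, which is one reason an MTBF figure should not be read as a guaranteed failure-free period.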


3. Reliability Prediction Model 

MIL-HDBK-217, the Military Handbook for “Reliability Prediction of Electronic
Equipment,” is the standard reference used in the military for reliability prediction of
electronic equipment parts. It was published by the Department of Defense (DoD) in the 
1960s. Since then it has been updated several times, with the most recent Revision F 
Notice 2, released in February 1995 [Ref. 9]. Table 2 shows some of the prediction 
models available in the military and commercial industry. 


Model               Description

MIL-HDBK-217        Original worldwide standard (MIL-STD-217) for commercial &
                    military electronic components

Telcordia SR-332    Original Bellcore standard for commercial grade electronic
                    components

PRISM               Originally developed by the Reliability Analysis Center (RAC),
                    incorporates process grading factors

CNET 93             Developed by France Telecom


Table 2. Reliability models/standards [After Ref. 12] 


Conventional reliability prediction assumes that all failures are independent. It
first defines the failure rate of all the key components that make up the system and sums
them up to obtain the overall system failure rate, assuming a series system. The MIL-HDBK-217
reliability prediction model assumes a constant failure rate for all the
components. The validity and usefulness of this assumption have often been challenged by
practitioners in the field of reliability. Many have denounced the use of this assumption
as impractical because it assumes that a system does not wear out over time. MIL-HDBK-217
consists of two approaches: Parts-Count and Parts-Stress.

a. Parts Count 

The Parts-Count approach is simpler as it requires less information than the
Parts-Stress approach. It only requires knowledge of the quantities of components,
application environment and quality factor, \pi_Q. A quality factor that is above 1.0 implies
a poor quality component. The prediction for each part is governed by the application of a
quality factor to a base failure rate. The quality factor for most standard electronic 
components can be found in MIL-HDBK-217. This approach is most useful in the early 
design stage when the system hardware is not yet available. 

MIL-HDBK-217F parts count defines the overall equipment failure rate as:

\lambda_{EQUIP} = \sum_{i=1}^{n} N_i (\lambda_g \pi_Q)_i    (2.2)

\lambda_g = Failure rate of the i-th generic part

n = Number of generic part categories

N_i = Quantity of the i-th generic part

\pi_Q = Quality factor of the i-th generic part
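
The summation in Equation (2.2) can be sketched in a few lines of Python. The part list, the generic failure rates and the quality factors below are hypothetical values chosen only to show the arithmetic; they are not taken from the MIL-HDBK-217 tables. Failure rates are assumed to be expressed in failures per 10^6 hours, the unit used by the handbook.

```python
from dataclasses import dataclass


@dataclass
class GenericPart:
    name: str
    quantity: int      # N_i
    lambda_g: float    # generic failure rate, failures per 10^6 hours
    pi_q: float        # quality factor


def parts_count_failure_rate(parts):
    """Equipment failure rate: sum over part categories of N_i * (lambda_g * pi_Q)_i."""
    return sum(p.quantity * p.lambda_g * p.pi_q for p in parts)


# Hypothetical bill of materials (illustrative numbers only).
parts = [
    GenericPart("film resistor", quantity=20, lambda_g=0.002, pi_q=1.0),
    GenericPart("ceramic capacitor", quantity=10, lambda_g=0.005, pi_q=1.0),
    GenericPart("digital IC", quantity=4, lambda_g=0.05, pi_q=2.0),
]

lambda_equip = parts_count_failure_rate(parts)   # failures per 10^6 hours
mtbf_hours = 1.0e6 / lambda_equip                # valid only under a constant failure rate

print(f"Equipment failure rate: {lambda_equip:.2f} per 10^6 h")
print(f"Predicted MTBF: {mtbf_hours:,.0f} h")
```

In this example the four integrated circuits, with a quality factor of 2.0, contribute most of the total, which illustrates how a single low-quality part category can dominate a parts-count prediction.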

b. Parts Stress 

The Parts-Stress approach is more complex as it takes into account the 
various stress factors such as temperature, environment, quality, electrical, etc, on the 
component. The electrical stress is usually defined as a ratio of the operating value to the 
rated value. For instance, the defining stress factor for a resistor is current. Therefore, the 
operating current and rated current are used in the part stress calculation model. This 
approach is more applicable later in the design phase when the hardware and knowledge 
of the operating environment are available in order to estimate the various stress factors. 

The models for the MIL-HDBK-217 Parts-Stress approach are much more
detailed and vary across part types. The model for the low frequency diode is shown
below [Ref. 17].
\lambda_p = \lambda_b \pi_T \pi_S \pi_Q \pi_E    (2.3)

\lambda_p = Part failure rate

\lambda_b = Base failure rate

\pi_T = Temperature factor

\pi_E = Environment factor

\pi_Q = Quality factor

\pi_S = Electrical stress factor


The accuracy of both approaches is highly dependent upon the availability 
and accuracy of data such as the base failure rate and the various required factors. 
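
The multiplicative structure of the Parts-Stress model can be sketched as follows. The function names and every numerical factor are hypothetical and are not read from MIL-HDBK-217; in practice each factor would be looked up or computed from the handbook for the specific part type, temperature, environment and quality level.

```python
def electrical_stress_ratio(operating_value, rated_value):
    """Electrical stress expressed as the ratio of operating value to rated value."""
    if rated_value <= 0:
        raise ValueError("rated value must be positive")
    return operating_value / rated_value


def parts_stress_failure_rate(lambda_b, pi_t, pi_s, pi_q, pi_e):
    """Part failure rate as the base rate scaled by each stress factor.

    lambda_b is assumed to be in failures per 10^6 hours.
    """
    return lambda_b * pi_t * pi_s * pi_q * pi_e


# Hypothetical part operated at 30 percent of its rated electrical value.
stress = electrical_stress_ratio(operating_value=0.3, rated_value=1.0)

# Illustrative factor values only; the mapping from the stress ratio to pi_S
# is handbook-specific and is not modeled here.
lambda_p = parts_stress_failure_rate(
    lambda_b=0.004, pi_t=2.0, pi_s=0.5, pi_q=1.0, pi_e=6.0
)
print(f"Stress ratio: {stress:.2f}")
print(f"Part failure rate: {lambda_p:.3f} per 10^6 h")
```

Because the factors multiply, an error in any single factor propagates proportionally into the predicted part failure rate, which is why the accuracy of the input data matters so much.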


C. MAJOR FACTORS AFFECTING RELIABILITY PREDICTION


There are a number of studies that either directly or indirectly address the problem 
of the reliability disparity. This section focuses on the limitations of the MIL-HDBK-217 
model. The key findings of the literature research suggest that the disparity stems from 
some inherent assumptions of the MIL-HDBK-217 model. For example, the constant 
failure rate assumption that has been generally applied in reliability prediction is usually 
not applicable. The lack of accurate field failure rates of components or parts can also 
affect prediction accuracy. Further, the prediction model cannot predict unexpected 
failure modes that occur in the field due to poor design and poor quality control. 

1. Inapplicability of the Constant Failure Rate Assumption in MIL- 
HDBK-217 Reliability Prediction Model 

System failures can be assumed to follow a Poisson process if the times to failure
of all the components that make up a system can be regarded as exponential and
component failures as independent. The rate of failure occurrence of the system can
then be obtained by summing up the failure rates of the individual components. This has
been regarded as reasonable for electronic components, and thus provides the basis for
the MIL-HDBK-217 prediction model. The exponential distribution also assumes all repairs,
no matter how minor, restore the system to an “as new” condition. This assumption is
often in stark contrast to reality for the following reasons [Ref. 10]:

1. Failure and repair of one part may cause damage to other parts. Therefore, the 
times between successive failures are not necessarily independent. 

2. Repairs may not totally renew the system. Repairs can be imperfect or they 
introduce other defects leading to failures of other parts. The lack of memory property of 
the exponential distribution might not be valid in every case. 

Since component failures are not always independent, the principle of summing 
up the failure rates of the individual components to obtain the overall system’s failure rate 
might not result in the best estimate. 

Below are two examples to further describe the limitations of using the constant 
failure rate assumption of the exponential distribution for reliability prediction. 

Figure 2 shows the results of using the exponential and Weibull distributions to 
model the human mortality rate [Ref. 11]. The same failure rate distribution is also
representative of a system with a short period of useful life followed by a long period of
wear-out. It can clearly be seen from Figure 2 that the exponential distribution has
grossly under-estimated the later failure rate while over-estimating the initial failure rate. 
In contrast, the Weibull distribution is more suitable in such a situation. The purpose of 
this example is to show that the constant failure rate assumption does not apply to a 
system with a dominant wear-out failure mechanism. 
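
The same effect can be reproduced with a small Python sketch that compares a Weibull hazard rate with the constant rate of an exponential model. This is not the analysis of Ref. 11: the shape and scale values are hypothetical, and matching the exponential model to the same mean life is an assumption made here only to obtain a comparable constant rate.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_mean_life(shape, scale):
    """Mean of the Weibull distribution: scale * Gamma(1 + 1 / shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# Hypothetical wear-out dominated item: shape > 1 gives an increasing hazard.
shape, scale = 3.0, 10.0  # scale in years

# Constant failure rate of an exponential model with the same mean life.
constant_rate = 1.0 / weibull_mean_life(shape, scale)

for t in (1.0, 5.0, 10.0, 15.0):
    h = weibull_hazard(t, shape, scale)
    verdict = "over-estimates" if constant_rate > h else "under-estimates"
    print(f"t = {t:4.1f} y: Weibull h = {h:.4f}, constant = {constant_rate:.4f} "
          f"({verdict} the hazard)")
```

Early in life the constant rate lies above the Weibull hazard, and late in life it lies well below it, which mirrors the pattern described for Figure 2.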

Figure 2. Human hazard rate analysis [From Ref. 11]

Figure 3. Constituent curves of the “bathtub” curve [From Ref. 14]

Further, the applicability of the constant failure rate assumption also hinges
strongly upon the relationship between the system’s nature and its life cycle [Ref. 14].
Figure 3 shows a typical bathtub curve with the three distinct regions. The failure rate of
electronic equipment with a maximum life cycle of five years can be approximated
by Case I. Case II approximates mechanical equipment with a life cycle of ten years. In
comparing the two cases, the results indicate that the constant failure rate
assumption provides a better approximation for a five-year period. The reliability is
given by the following equation.

R(t) = e^{-H(t)}    (2.4)

H(t) = Cumulative hazard rate

The failure rate was first under-estimated during the early failure region and then 
over-estimated during the constant failure rate region. Overall, it still provides a fairly 
good approximation. 

On the other hand, the error between prediction and actual is simply too great for
a ten-year period due to the relatively long period of increasing failure rate in the wear-out
region. This leads to the important conclusion that the use of the constant failure rate
assumption is highly dependent upon the system’s life cycle.
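
The dependence on life cycle can be illustrated by integrating a hazard rate numerically to obtain the cumulative hazard and then the reliability. The hazard function below is a hypothetical simplification and is not the curve of Figure 3: it omits the early failure region and assumes a flat useful life of five years followed by a linearly increasing wear-out rate.

```python
import math


def bathtub_hazard(t_years):
    """Hypothetical hazard: constant for five years, then linear wear-out."""
    base_rate = 0.10          # failures per year during useful life
    wearout_slope = 0.08      # added failures per year, per year past year 5
    if t_years <= 5.0:
        return base_rate
    return base_rate + wearout_slope * (t_years - 5.0)


def cumulative_hazard(hazard, t_end, steps=1000):
    """Trapezoidal integration of the hazard rate from 0 to t_end."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        total += 0.5 * (hazard(k * dt) + hazard((k + 1) * dt)) * dt
    return total


def reliability(hazard, t_end):
    """R(t) = exp(-H(t)), with H(t) the cumulative hazard."""
    return math.exp(-cumulative_hazard(hazard, t_end))


for life_cycle in (5.0, 10.0):
    actual = reliability(bathtub_hazard, life_cycle)
    predicted = math.exp(-0.10 * life_cycle)  # constant failure rate prediction
    print(f"{life_cycle:4.1f} y: actual R = {actual:.3f}, "
          f"constant-rate R = {predicted:.3f}")
```

For the five-year life cycle the two values agree, while for the ten-year life cycle the constant-rate prediction is far more optimistic than the value obtained from the full hazard curve.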

In addition, a similar conclusion that can be drawn from the two previous
examples is that the constant failure rate assumption tends to produce a low
estimate of the system’s overall failure rate, with an error that depends upon the relative length
of the wear-out region over its life cycle. As observed from Figure 3, the wider the wear-out
region over the life cycle, the greater will be the error margin. This further reinforces the
point that it is not suitable for predicting failure rates of a system with a dominant wear-out
failure mechanism. Reliability prediction using this assumption for a system
characterized by a long period of wear-out provides little insight from the logistics
planning perspective as it can result in severe under-purchasing of spares. All these reasons
explain why reliability prediction using the constant failure rate assumption often yields
results inconsistent with field reliability.
2. Lack of Accurate Failure Rate Data


A reliability prediction model is effectively a set of “best guesses,” and to achieve
any degree of accuracy it must use empirically acquired field data. Prediction accuracy
depends to a large extent on the amount of field data available, and the painful fact is that
data collection takes a long time [Ref. 13].

The task of acquiring field data of components is not a simple task because it 
takes time for a component to fail before meaningful data on failure rates can be 
gathered. A well designed part is less likely to fail early, leading to extended waiting 
time for any useful information. Because the task is so time consuming, there are 
relatively few sources, usually from the manufacturers themselves. The largest sources of 
field data are the Non-electronic Parts Reliability Data (NPRD-95) and Electronic Parts 
Reliability Data (EPRD-97) produced by the military [Ref. 18]. These were compiled 
through years of observation, repair records, and other activities. Since failure rate 
depends mainly on design and application, these data are not representative of all cases. 
Further, the rapid development of electronic technology limits the ability to collect ample 
data for any particular technology. 

A possible solution to shorten the waiting time for acquiring failure rate data is to 
apply accelerated life testing (ALT) to components. Accelerated life tests are component 
life tests in which components are operated at high stress and the resulting failure data 
are observed [Ref. 22]. 


3. Inability to Predict Unexpected Failures Modes Due To Poor Design 
and Quality Related Problems 


Westinghouse Defense and Electronic Center performed a case study on a 
complex Electronic Countermeasures (ECM) military radar system that underwent a 
Reliability Demonstration Test (RDT) to study the differences between predicted and 
field reliability and analyzed these problems in light of the MIL-HDBK-217 prediction 
model [Ref. 15]. 



                Predicted MTBF    RDT MTBF 
Radar System    282 hours         100 hours 

Table 3. Predicted and RDT MTBF [After Ref. 15] 


It was found that the main differences are related to the assumptions made of the 
quality of design and the adherence to the established and specified quality control 
procedures in producing the parts. The MIL-HDBK-217 model inherently assumes that 
certain standards are followed in these areas based on specified engineering requirements 
but this assumption is not always valid in all cases. Two examples of failures that were 
identified during the RDT test will be presented to support this claim. 

The first failure to be discussed is due to a design deficiency of a thin film RF 
amplifier. This failure arose because of inadequate clearance between the toroid of the 
RF amplifier and the case lid, which was not foreseen during the initial design of the 
device. The toroid was mounted too close to the device lid, and after several thermal 
cycles it moved relative to its original position and made contact with the lid, resulting 
in a short circuit. The assumption during the initial reliability prediction of this device 
was that all design considerations for this device were completely satisfied. Obviously, 
this assumption was not valid in this case. 

The second failure concerns thick film devices that consist of many discrete parts 
and solder joints. Solder balls formed as a result of the solder flow process being out of 
control, in that the solder flow temperature deviated from the specified range during the 
device manufacturing process. The resulting solder balls, which were loosely attached at 
various points of the device, broke loose and lodged between various chips and the 
substrate, causing component-to-substrate short circuits. Reliability prediction is unable 
to predict these unexpected failure modes that arise as a result of poor quality control, as 
it inherently assumes that all processes are in proper control. 

The two failures previously described were a direct result of poor design and lack 
of quality control, respectively, which were identified during RDT. During the initial 
reliability prediction prior to production and test, it is almost impossible to know in 
advance how good these control methods and engineering designs would be. Therefore it 
is extremely important to be aware of the differences between the inherent assumptions 
of the MIL-HDBK-217 prediction model and the many uncertainties that can happen 
during the actual engineering process. 


D. RECOMMENDATIONS 


The constant failure rate model is mathematically simple for reliability prediction 
but it is not always applicable. It serves as a good approximation for a system that is 
characterized by a long period of useful life and a short period of early failure. In order to 
improve the precision of reliability prediction, the reliability engineer must be able to 
recognize when the mathematical simplicity of the constant failure rate model can be 
used without a substantial penalty in prediction accuracy. This can be achieved by 
analyzing the failure rate distribution of a system over its intended life and deciding if the 
exponential distribution is applicable. The failure rate distribution of a system can be 
estimated by analyzing the failure trends of similar class of systems. 

It is also important to be aware of Drenick’s Theorem, which proves that 
complex repairable systems, under certain constraints, tend towards being well 
represented by the exponential distribution [Ref. 18]. Given that most military systems 
(aircraft, artillery guns, or naval ships) are usually composed of a large number of 
components, it would seem that the constant failure rate assumption is applicable. The 
usefulness of Drenick’s Theorem depends on the following constraints: 1) the 
subcomponents must be in series; 2) the subcomponents fail independently; 3) a failed 
subcomponent is replaced immediately; 4) the replacement subcomponent must be 
identical; and 5) a few system repairs have already been made. Once these conditions are 
met and as the number of subcomponents increases, system failures will tend to be 
exponentially distributed regardless of the failure distributions of the subcomponents. 
This result allows reliability practitioners to disregard the failure distribution of the 
individual components that make up the system, since the overall system will fail 
exponentially. 

However, one must be aware that using the constant failure rate assumption has 
the tendency to produce a conservative estimate of the overall system failure rate and this 
is important from the logistics perspective especially in the purchase of spares. 
Alternatively, the Weibull distribution provides a possibly more accurate prediction but it 
will increase the mathematical complexity. There is always a tradeoff between accuracy 
and mathematical complexity. 

Even if the exponential distribution can be used to model a system, the disparity 
between predicted and field reliability may still exist in new systems because of 
unexpected failure modes that may arise in the presence of design and quality 
deficiencies, which will prevent the system from reaching the predicted value. One 
possible solution to eliminate or reduce the frequency of occurrence of unexpected 
failures in the field is to apply reliability growth testing (RGT) during the development 
phase. In contrast to the exponential distribution, AMSAA reliability growth models used 
for reliability growth analysis assume that system failures follow a Non-Homogeneous 
Poisson Process (NHPP). Reliability growth testing recognizes that the drawing board 
design of a complex product cannot be perfect from the reliability point of view and 
allocates the necessary time and resources to fine tune the design by finding those 
problems that are impossible to know in advance during reliability prediction and 
designing them out. It follows the formal process of Test-Analyze-And-Fix (TAAF), 
which involves testing the system to surface all possible failure modes, analyzing the 
underlying failure mechanisms to determine their root causes, implementing corrective 
actions to improve the design and finally re-testing to verify the effectiveness of the 
corrective actions in preventing future occurrences. Once the system matures through a 
period of testing and reliability growth has reached a plateau, the system’s failures will 
tend towards being well represented by an exponential distribution. Consequently, the 
disparity between predicted and field MTBF can be minimized. It is also important to 
realize that in order to maximize the benefits of a reliability growth program, it has to be 
conducted as early as possible in the development phase once the prototype is available. 
The earlier these problems are identified, the more time will be available to verify the 
effectiveness of the design changes. Furthermore, the cost associated with redesigning a 
product late in the development cycle is extremely high. 

The remaining chapters of this thesis will discuss the reliability growth testing of 
a 155mm SPH artillery gun using reliability growth methodology. The next chapter first 
introduces the reliability growth methodology and then illustrates the use of Duane’s 
model to determine the essential parameters needed for constructing the idealized growth 
curve as part of reliability growth planning. 
III. RELIABILITY GROWTH PLANNING 


A. INTRODUCTION 


Reliability growth is the improvement of a product’s reliability over time (hence 
the term growth) using the TAAF philosophy through learning about the deficiencies of 
the design and taking action to eliminate or minimize the effect of these deficiencies. The 
growth in reliability is quantified by a decrease in system’s failure rate or increase in the 
test phase average MTBF over time due to the removal of failure sources. Figure 4 
reflects a decreasing trend in failure rate which signifies reliability improvement over 
time. 



Figure 4. Failure rate versus time [From Ref. 5] (horizontal axis: test time) 

The success of a reliability growth program is dependent on factors including the 
initial planning of the reliability program and an accurate assessment of the system’s 
current reliability status. It is important to track reliability throughout the test program. 
This is accomplished by assessing system reliability at the end of each test phase and 
comparing the current reliability with the planned reliability. Planning and tracking of 
reliability growth requires the use of mathematical models. 

One mathematical model used for developing an idealized growth curve in 
reliability growth planning is Duane’s model. A second mathematical model, used for 
tracking reliability growth, is the Non-Homogeneous Poisson Process (NHPP) model 
known as the AMSAA model [Ref. 16]. In contrast to the constant failure rate model 
used in reliability prediction, the AMSAA model describes the failure rate of the system 
as a function of time. 

B. RELIABILITY GROWTH PROGRAM OF THE COMBAT SYSTEM 


The development of a large combat system generally involves years of design, 
testing, fault diagnosis, and redesign to assure that when the system development is 
completed, the final system meets or exceeds the user requirements. Reliability Growth 
Testing (RGT) was implemented on one unit of the prototype as part of the Reliability 
and Maintainability (R&M) program. The combat system consists of two major 
subsystems which are known as subsystem A and B. The ultimate goal of the combat 
system reliability growth testing program is to achieve the stated reliability requirements 
for both subsystems. 

The reliability growth program for the combat system focuses on the following 
areas: 

Reliability growth planning: To develop an achievable solution based on 
available resources and schedule constraints. 

Test-Analyze-And-Fix (TAAF): Failure causes are isolated, analyzed and 
then fixed. 

Reliability growth tracking: To determine if reliability requirements have 
been demonstrated. 

The RGT subjects a single prototype unit to actual field conditions. All failures 
that surfaced during the test were analyzed, fixed and re-tested. As the testing 
progresses, these fixes are incorporated into the prototype so that reliability will 
improve during the course of the test. 

Major development efforts were mainly focused on subsystem A as it involves the 
integration of many important subsystems. The testing for subsystem A was planned 
over three phases. The first phase is considered pre-development testing, used to estimate 
the initial reliability of the prototype in order to gauge the amount of development effort 
required to meet the target reliability goals. The additional two phases focus on meeting 
the final reliability target. Reliability testing for subsystem B was also planned over 
three phases. 


C. RELIABILITY GROWTH PLANNING METHODOLOGY 


The first step in the reliability growth process is reliability growth planning. 
Reliability growth planning involves the development of an idealized growth curve. The 
major role of the idealized reliability growth curve is to quantify the overall development 
efforts so that the growth pattern can be evaluated relative to the basic objectives and 
resources. It also provides the program manager with a useful tool to monitor the 
reliability growth of the weapon system during its development. 

The reliability of a system under development is generally increasing rapidly at 
the beginning and slows down towards the end. The idealized growth curve shown in 
Figure 5 depicts reliability growth as a smooth non-decreasing concave down curve with 
respect to time. A typical reliability test program consists of several test phases. A 
smooth curve fitted to the proposed reliability values of the system at the end of each 
test phase represents the overall pattern for reliability growth. 

Figure 5. Example of an idealized growth curve 


The formula for developing the idealized growth curve is based on Duane’s model 
[Ref. 5], 

$$M(t) = \begin{cases} M_I, & 0 < t < t_I \\ M_I \left(\dfrac{t}{t_I}\right)^{\alpha} (1-\alpha)^{-1}, & t > t_I \end{cases} \tag{3.1}$$

Where 

M_F = Desired MTBF value at T 

t = Cumulative test time 

t_I = Cumulative test time at starting point 

M_I = Average initial MTBF of the system at the beginning 

α = System reliability growth rate between 0 and 1.0 
The development of the idealized growth curve starts with the determination of 
the length of the initial test phase (t_I) and the average MTBF (M_I) over this initial test 
phase. The success in the development of the idealized growth curve depends on the 
ability to accurately estimate these parameters, as they will affect the total test time and 
growth rate required to achieve the required reliability. A system with a lower initial 
average MTBF will require a longer test time given a fixed growth rate. 

There is no standard way of determining the values of these parameters. The 
average initial MTBF of the system, M_I, is the average MTBF over the initial test phase 
before any modification is developed, implemented or tested. The practice of arbitrarily 
choosing a starting point, such as 10% of the requirement, is not recommended [Ref. 5]. 
One way of accurately determining these parameters is to perform an initial test so that 
M_I and t_I are known. The initial test phase of the RGT program is conducted to 
“stabilize” the test data, so it must be long enough for the first failure mode to surface. 


The value M_F represents the desired MTBF at the end of the reliability growth 
test. The total amount of testing, T, is determined through a joint effort between the 
contractor and the program manager and is derived from considerations of available 
resources and calendar time, as well as the number of prototypes available. 

The growth rate of the system (α) determines the length of time needed to grow 
from the initial MTBF to the required MTBF. The growth rate gives an indication of how 
fast the system reliability is improving. It is governed by the efficiency with which 
failures are corrected. A large growth rate (α > 0.5) reflects an aggressive reliability 
program, while a low growth rate (α < 0.1) reflects a program where no quick fixes are 
available. 

For fixed values of T, M_F, M_I and t_I, the value of α may be approximated by 
solving the following equation: 

This is a reasonably good approximation when α is less than 0.4 [Ref. 16]. A 
more precise way to solve for the value of α in equation 3.1 is to use numerical 
methods, e.g. with MS Excel. 
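The two routes to the growth rate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: the closed-form expression in `alpha_approximation` is the approximation commonly quoted from MIL-HDBK-189 and is assumed here to be the one the text refers to, while `alpha_numerical` solves equation 3.1 at t = T by bisection instead of a spreadsheet solver. All function names are hypothetical.

```python
import math


def alpha_approximation(m_initial, m_final, t_initial, t_total):
    """Closed-form approximation of the growth rate (assumed MIL-HDBK-189 form)."""
    log_ratio = math.log(t_total / t_initial)
    return -log_ratio - 1.0 + math.sqrt(
        (1.0 + log_ratio) ** 2 + 2.0 * math.log(m_final / m_initial)
    )


def final_mtbf(alpha, m_initial, t_initial, t_total):
    """Equation 3.1 evaluated at t = T, for T greater than the initial phase length."""
    return m_initial * (t_total / t_initial) ** alpha / (1.0 - alpha)


def alpha_numerical(m_initial, m_final, t_initial, t_total, tol=1e-10):
    """Solve equation 3.1 for alpha by bisection on the interval (0, 1).

    Assumes m_final > m_initial and t_total > t_initial, in which case the
    final MTBF increases monotonically with alpha and the root is unique.
    """
    low, high = 0.0, 1.0 - 1e-9
    while high - low > tol:
        mid = 0.5 * (low + high)
        if final_mtbf(mid, m_initial, t_initial, t_total) < m_final:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    # Subsystem A planning inputs quoted later in this chapter.
    approx = alpha_approximation(70.0, 200.0, 280.0, 2300.0)
    exact = alpha_numerical(70.0, 200.0, 280.0, 2300.0)
    print(f"approximate alpha = {approx:.3f}, numerical alpha = {exact:.3f}")
```

Bisection is used here only because it needs no external library; any root finder would serve equally well.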


1. Development of the Idealized Growth Curve for Subsystem A 


The development of the idealized growth curve is based on the initial estimate of 
the MTBF and the constraints on testing, such as the number of units under test, 
resources and time available for testing. For subsystem A, the parameters for constructing 
the idealized growth curve are based on the given mission conditions. 

A mission reliability of 200 rounds Mean-Rounds-Between-Failure (MRBF) was 
required at the end of the reliability growth test. Since this is a combat system, the total 
test time for this subsystem is expressed in terms of number of rounds instead of calendar 
time, and it was limited to a maximum of 2300 rounds due to resource constraints in the 
development program. 

Average initial MRBF, (M_I). The initial MRBF is determined based on 
pre-developmental testing of the proposed system. The pre-development testing resulted 
in 4 mission affecting failures in 280 rounds. The MRBF was projected to be constant 
during this initial testing because no significant design changes were incorporated during 
the test, so the MRBF was estimated as: 

$$\text{Initial MRBF} = \frac{280}{4} = 70 \text{ rounds} \tag{3.3}$$


Growth rate, (α). The initial MRBF is estimated to be 70 rounds and a final 
MRBF of 200 rounds is desired after 2300 rounds of testing. For this program, the first 
test phase is 280 rounds. The desired growth rate parameter can be determined from the 
following equation. 

The growth rate, α, is found to be 0.32 for the given conditions; anything less 
would violate the resource constraints. The approximation of the α value of 0.32 in 
equation 3.4 is consistent with the results of the numerical method. An α value of 0.32 
indicates a relatively aggressive development program that would require emphasis on 
the analysis and fixing of problem failure modes [Ref. 16]. Since major development 
efforts will be focused on subsystem A, an α value of 0.32 is reasonable. The total 
test time is sensitive to the parameter α. As shown in Table 4, using a test time of less 
than 2300 rounds would result in a projected α greater than 0.32, which means that it 
will require an even more aggressive reliability growth program. 


M_I           70 
t_I           280 

α             0.3     0.32    0.34    0.36    0.38    0.4 
T (rounds)    2880    2280    1780    1500    1280    1080 

Table 4. Sensitivity analysis of α on total test duration 


The test parameters of the reliability growth plan are summarized in Table 5 
below. 

Subsystem A 
Reliability Target (MRBF)    200 rounds 
Total Test Duration          2300 rounds 
Growth Rate                  0.32 

Table 5. Summary of test parameters for subsystem A 


The plan assumed that the MRBF of subsystem A would grow from its initial 
level to the required 200 rounds MRBF in accordance with the following form of Duane’s 
expression for reliability growth: 

$$M(t) = \begin{cases} 70, & 0 < t < 280 \\ 70 \left(\dfrac{t}{280}\right)^{0.32} (1-0.32)^{-1}, & t > 280 \end{cases} \tag{3.5}$$
T (rounds)    M(t) (MRBF) 
0             70 
280 (-)       70 
280 (+)       103 
380           114 
500           124 
650           135 
800           144 
950           152 
1100          159 
1250          166 
1400          172 
1550          178 
1700          183 
1850          188 
2000          193 
2300          202 

Table 6. Computed MRBF values for the idealized growth curve 



Figure 6. Idealized growth curve for subsystem A 

The idealized growth curve for subsystem A is shown in Figure 6. The reliability 
growth test of subsystem A consists of two additional test phases at 280-1180 rounds and 
1100-2300 rounds. More test time is allocated to the last test phase as more time 
will be required to verify the effectiveness of the previous fixes. 
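The idealized growth curve values can be regenerated with a short script. The sketch below evaluates the Duane expression for the subsystem A planning parameters (initial MRBF of 70 rounds, initial phase of 280 rounds, growth rate of 0.32); the function name is hypothetical, and treating the boundary point t = 280 as part of the initial phase is an assumption made here for convenience.

```python
def idealized_mtbf(t, m_initial, t_initial, alpha):
    """Idealized growth curve based on Duane's model.

    Returns the constant initial average during the first test phase and the
    Duane growth expression afterwards. The boundary t == t_initial is
    assigned to the initial phase (an assumption of this sketch).
    """
    if t <= t_initial:
        return m_initial
    return m_initial * (t / t_initial) ** alpha / (1.0 - alpha)


if __name__ == "__main__":
    # Cumulative rounds at which the curve is tabulated.
    test_points = [0, 280, 380, 500, 650, 800, 950, 1100, 1250,
                   1400, 1550, 1700, 1850, 2000, 2300]
    for rounds in test_points:
        mrbf = idealized_mtbf(rounds, 70.0, 280.0, 0.32)
        print(f"{rounds:5d} rounds -> MRBF {mrbf:6.1f}")
```

The same function applies to any other subsystem by substituting its own initial value, initial phase length and growth rate.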


2. Development of the Idealized Growth Curve for Subsystem B 

The approach taken for developing the idealized growth curve for subsystem B is 
similar to that for subsystem A. A mission reliability of 350 kilometers 
Mean-Kilometers-Between-Failure (MKBF) was required at the end of the reliability 
growth test. The total test time for the subsystem is expressed in terms of kilometers. 

Average initial MKBF, (M_I). The initial MKBF was estimated during the 
prototype run-in test. The run-in test resulted in 6 mission affecting failures in 1000 
kilometers. The MKBF was projected to be constant during this initial testing because no 
significant design changes were incorporated during the test, so the MKBF was estimated 
as: 

$$\text{Initial MKBF} = \frac{1000}{6} \approx 167 \text{ kilometers} \tag{3.6}$$

Growth rate, (α) and total test time. The initial MKBF is estimated to be 167 
kilometers and a final MKBF of 350 kilometers is desired at the end of the testing. Table 
7 shows the various growth rates and the corresponding total test time computed based on 
the initial MKBF and test time. 


M_I       167 
t_I       1000 

α         0.25    0.26    0.27    0.28    0.29    0.3     0.32    0.34    0.36    0.38 
T (km)    6150    5450    4850    4350    3950    3600    3050    2600    2270    2000 

Table 7. Growth rate versus total test duration 
Based on schedule constraints, the maximum allowable test time was 
approximately 4850 kilometers. The corresponding desired growth rate is 0.27. The 
approximation of the α value of 0.27 in equation 3.4 is consistent with the results of the 
numerical method. 



Subsystem B 
Reliability Target (MKBF)    350 kilometers 
Total Test Duration          4850 kilometers 
Growth Rate                  0.27 

Table 8. Summary of test parameters for subsystem B 


The plan assumed that the MKBF would grow from its initial level to the required 
350 kilometers MKBF in accordance with the following form of Duane’s expression for 
reliability growth: 

$$M(t) = \begin{cases} 167, & 0 < t < 1000 \\ 167 \left(\dfrac{t}{1000}\right)^{0.27} (1-0.27)^{-1}, & t > 1000 \end{cases} \tag{3.7}$$


T (km)        M(t) (MKBF) 
0             167 
1000 (-)      167 
1000 (+)      229 
1300          246 
1600          260 
1900          272 
2200          283 
2500          293 
2800          302 
3100          310 
3400          318 
3700          326 
4000          333 
4300          339 
4600          345 
4850          350 

Table 9. Computed MKBF values for the idealized growth curve 
Figure 7. Idealized growth curve for subsystem B 

The idealized growth curve for subsystem B is shown in Figure 7. Similarly, the 
reliability growth test of the chassis consists of two additional test phases at 1000-2600 
kilometers and 2600-4850 kilometers. More test time is allocated to the last test 
phase as more time will be required to verify the effectiveness of the previous fixes. 
Compared to subsystem A, a lower α value of 0.27 for subsystem B is reasonable since it 
is an Off-The-Shelf (OTS) system. 
IV. RESULTS AND DISCUSSION OF RELIABILITY GROWTH ANALYSIS 


A. INTRODUCTION 


This chapter first introduces the reliability growth models used for the reliability 
growth analysis of the combat system and follows with the results and discussion. The 
objectives of the reliability growth analysis include: 

1. Estimating the demonstrated MRBF and MKBF of the two subsystems at 
the end of each test phase. 

2. Projecting the MRBF and MKBF of the subsystems if delayed fixes were 
incorporated at the end of a test phase. 

3. Generating reliability growth plots (e.g. MTBF vs Time) to determine if 
reliability is improving, decreasing or constant. 

The demonstrated MRBF or MKBF provides an estimate for the system 
configuration on test at the end of a test phase. This value is determined by analysis of the 
test results using AMSAA reliability growth models. The demonstrated value is then 
compared to the idealized growth curve at the end of each test phase to determine if 
reliability growth is progressing satisfactorily. 

A projected reliability value is an estimate of the increase in system reliability 
that takes into account the effect of delayed fixes. 

Equations 4.1 to 4.24 in the following sections are taken from Reference 20, and 
were formulated by Dr. Larry Crow. 


B. AMSAA RELIABILITY GROWTH MODELS 


There are three types of AMSAA growth models used for reliability growth 
analysis. They are 1) AMSAA Basic Model for Test-Fix-Test, 2) AMSAA Projection 
Model for Test-Find-Test and 3) AMSAA Extended Model for Test-Fix-Find-Test [Ref. 
20]. The distinction between these three models is when fixes are incorporated into the 
system. 

The test-fix-test model is employed when a corrective action is immediately 
found for a failure mode and is incorporated into the system, which is then retested to 
verify its effectiveness and to surface new failure modes. This model estimates the 
achieved reliability of the system after all fixes have been incorporated into the system 
before the end of a test phase. However, it cannot estimate the increase in reliability due 
to delayed fixes that were incorporated at the end of a test. 

The test-find-test model is employed when corrective actions for all surfaced 
failure modes are incorporated into the system at the end of the test. These corrective 
actions result in a distinct jump in system reliability. This model estimates the jump in 
reliability due to delayed fixes. 

The test-fix-find-test model is a combination of the two types discussed above. It 
is employed when some corrective actions are incorporated into the system during the 
test while some are delayed until the end of the test. 

It is important to note that the choice of model for analysis should not be 
determined by the data but rather a realistic assessment of the test program’s corrective 
actions. 


1. AMSAA Basic Model for Test-Fix-Test 

The AMSAA model employs the Weibull process to model reliability growth 
during a developmental phase. This model was formulated by Dr. Larry Crow and it is 
frequently used on systems when usage is measured on a continuous scale. It is also 
designed for tracking reliability within a test phase and not across test phases [Ref. 19]. 
The test-fix-test model evaluates reliability growth that results from the introduction of 
design fixes into the system during a particular test phase. 
The AMSAA test-fix-test model assumes that system failures during a 
development testing phase follow a non-homogeneous Poisson process (NHPP) with a 
Weibull intensity function of the following form: 

$$r(t) = \lambda \beta t^{\beta - 1} \tag{4.1}$$

t = cumulative test time 

λ = the scale parameter. It depends on the unit of measurement chosen for t 

β = the shape parameter (also known as the growth parameter), because it 
characterizes the shape of the graph of the intensity function 

The relationship between the growth rate and shape parameter is given as: 

$$\alpha_{\text{DUANE}} = 1 - \beta_{\text{AMSAA}} \tag{4.2}$$

Suppose development testing for a particular test phase stops at time T and no 
further improvements are made to the system. In other words, the system 
configuration is fixed after time T. The demonstrated or achieved failure intensity is 

$$\lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} \tag{4.3}$$


The demonstrated instantaneous MTBF at the end of the test phase after T units of 
testing is given as the reciprocal of the intensity function: 

$$M_{CA} = \left[ \lambda \beta T^{\beta - 1} \right]^{-1} \tag{4.4}$$

β is a very important parameter as it indicates whether there is reliability growth 
during the development process. Three possible conditions are reflected by the value of β: 

β < 1: Positive reliability growth because failure rate is decreasing 

β = 1: The constant case. No reliability growth because failure rate is constant 

β > 1: Negative reliability growth because failure rate is increasing with time 
If the testing were stopped at time T and significant modifications were made to the 
system, there may be a jump in the system’s reliability. However, the AMSAA test-fix-test 
model does not estimate the jump due to these delayed fixes. 
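Equations 4.1 to 4.4 and the interpretation of the shape parameter can be sketched as below. The function names and the tolerance used to decide whether the shape parameter equals 1 are hypothetical choices made for this illustration, and the example parameter values are not taken from the test data.

```python
def failure_intensity(t, lam, beta):
    """Weibull intensity function of the NHPP: r(t) = lam * beta * t ** (beta - 1)."""
    return lam * beta * t ** (beta - 1.0)


def demonstrated_mtbf(total_time, lam, beta):
    """Demonstrated MTBF at the end of the phase: reciprocal of r(T)."""
    return 1.0 / failure_intensity(total_time, lam, beta)


def growth_trend(beta, tol=1e-9):
    """Classify the reliability trend implied by the shape parameter."""
    if abs(beta - 1.0) <= tol:
        return "no growth"
    return "positive growth" if beta < 1.0 else "negative growth"


def duane_growth_rate(beta):
    """Growth rate of Duane's model implied by the AMSAA shape parameter."""
    return 1.0 - beta


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    lam, beta, total_time = 0.4, 0.5, 100.0
    print(growth_trend(beta), demonstrated_mtbf(total_time, lam, beta))
```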

Parameter Estimation Using the Method of Maximum Likelihood 

Estimates of the two parameters β and λ are made using the method of 
maximum likelihood in MIL-HDBK-189 [Ref. 5]. They are estimated based on 
times-to-failure data accumulated during a given test phase. It is therefore important 
to collect the actual times to failure and the total test time during development testing. 
The estimate of the shape parameter β is given by 

$$\hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} \tag{4.5}$$

N = Total number of failure occurrences 
T = Total accumulated test time 
X_i = Cumulative test time at which the ith failure occurred 

The scale parameter is given by 

$$\hat{\lambda} = \frac{N}{T^{\hat{\beta}}} \tag{4.6}$$
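A minimal sketch of the maximum likelihood estimates follows, assuming a test that is stopped at a fixed total test time T and the standard estimator of the scale parameter, N divided by T raised to the estimated shape parameter. The function name and the failure times in the usage example are hypothetical and are not test data from this program.

```python
import math


def amsaa_mle(failure_times, total_time):
    """Maximum likelihood estimates (lambda_hat, beta_hat) for the AMSAA model.

    failure_times: cumulative test times at which failures occurred.
    total_time:    total accumulated test time T (time-terminated test).
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    if any(x <= 0.0 or x > total_time for x in failure_times):
        raise ValueError("failure times must lie in (0, total_time]")
    denominator = n * math.log(total_time) - sum(math.log(x) for x in failure_times)
    if denominator == 0.0:
        raise ValueError("all failures at T: shape parameter is not estimable")
    beta_hat = n / denominator
    lambda_hat = n / total_time ** beta_hat
    return lambda_hat, beta_hat


if __name__ == "__main__":
    # Hypothetical cumulative failure times, for illustration only.
    times = [12.0, 40.0, 95.0, 180.0, 300.0]
    lam, beta = amsaa_mle(times, total_time=400.0)
    print(f"lambda_hat = {lam:.4f}, beta_hat = {beta:.3f}")
```

A quick sanity check on any result is that the estimated scale parameter multiplied by T raised to the estimated shape parameter reproduces the observed number of failures.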


Cramer-Von Mises Goodness of Fit Test 

Next, the Cramer-Von Mises goodness of fit test is performed to determine if 
there is enough information to reject the hypothesis that the reliability growth process can 
be described by the AMSAA model [Ref. 5]. The Cramer-Von Mises statistic is given by 
the following expression: 

$$C_M^2 = \frac{1}{12M} + \sum_{i=1}^{M} \left[ \left( \frac{X_i}{T} \right)^{\bar{\beta}} - \frac{2i - 1}{2M} \right]^2 \tag{4.7}$$

M = number of failures 
$\bar{\beta}$ = unbiased estimate of the shape parameter β 

If the statistic $C_M^2$ exceeds the critical value corresponding to M for a chosen 
significance level, then the null hypothesis that the AMSAA model adequately describes 
the growth process shall be rejected. Otherwise, the model shall be accepted. 
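The statistic itself is straightforward to compute once the failure times and a shape parameter estimate are available. The sketch below is an assumption-laden illustration: the function name is hypothetical, the shape parameter is passed in by the caller rather than derived inside the function, and the critical values are not reproduced here and must be taken from the published table in the handbook.

```python
def cramer_von_mises(failure_times, total_time, beta):
    """Cramer-Von Mises goodness of fit statistic for the AMSAA model.

    failure_times: cumulative failure times (sorted internally).
    total_time:    total accumulated test time T.
    beta:          shape parameter estimate supplied by the caller.
    """
    m = len(failure_times)
    if m == 0:
        raise ValueError("at least one failure time is required")
    statistic = 1.0 / (12.0 * m)
    for i, x in enumerate(sorted(failure_times), start=1):
        expected = (2 * i - 1) / (2.0 * m)
        statistic += ((x / total_time) ** beta - expected) ** 2
    return statistic


def model_rejected(statistic, critical_value):
    """Reject the AMSAA model when the statistic exceeds the tabulated critical value."""
    return statistic > critical_value
```

Keeping the critical value as an explicit argument avoids hard-coding table entries that depend on both the number of failures and the chosen significance level.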

2. AMSAA Projection Model for Test-Find-Test 

The AMSAA Projection Model for Test-Find-Test classifies all failure modes into 
two groups [Ref. 20]: 

Type A failure modes. No corrective actions will be taken for Type A failure 
modes. Type A failure modes have a constant failure intensity, λ_A. 

Type B failure modes. Failure modes whose corrective actions will only be taken 
at the end of the test. 

For the test-find-test model, the system failure intensity is constant (β = 1) during 
the test because no corrective actions are incorporated into the system. The system then 
experiences a jump in reliability after the incorporation of delayed fixes. The achieved 
system failure rate λ_S, prior to the delayed fixes, can be estimated as follows: 

$$\lambda_S = \lambda_A + \lambda_B \tag{4.8}$$

$$\lambda_A = \frac{\text{Total number of Type A failures}}{\text{Total test time}} = \frac{N_A}{T} \tag{4.9}$$

$$\lambda_B = \frac{\text{Total number of Type B failures}}{\text{Total test time}} = \frac{N_B}{T} \tag{4.10}$$


The projected failure intensity after the incorporation of delayed fixes is obtained 
by assigning an effectiveness factor (EF), d_j, to every unique Type B failure mode. The 
assigned effectiveness factor, based on engineering assessment, represents the fractional 
decrease in the failure rate λ_j of the j-th Type B failure mode after fixes have been 
incorporated. The total number of Type B failures observed during a test is 

$$N_B = \sum_{j=1}^{M} N_j \tag{4.11}$$

M = Total number of unique Type B failure modes and 
N_j = Total number of failures for the j-th observed distinct Type B mode. 

The projected failure intensity is then computed as follows: 

$$\lambda_P = \lambda_A + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T) \qquad (4.12)$$

$$\bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} \qquad (4.13)$$

$$h(T) = \lambda\beta T^{\beta-1} \qquad (4.14)$$

$\lambda$ and $\beta$ are calculated using equations 4.5 and 4.6 based only on the M first
occurrence failure times of the seen and unique Type B failure modes [Ref. 20].

The objective of this model is to estimate the jump in MTBF which is the inverse 
of the projected failure intensity given by 


$$M_P = \frac{1}{\lambda_P} \qquad (4.15)$$
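A short simplification is useful for hand checks. It assumes that equation 4.6 has the time-terminated form $\lambda = M / T^{\beta}$, which is the form used in the worked examples of this chapter; under that assumption the unseen-mode intensity of equation 4.14 collapses to a single ratio:

```latex
h(T) = \lambda \beta T^{\beta - 1}
     = \frac{M}{T^{\beta}} \, \beta \, T^{\beta - 1}
     = \frac{M \beta}{T}
```

For example, with M = 3 modes, $\beta$ = 0.8319 and T = 280 rounds, this gives 3 × 0.8319 / 280 ≈ 0.0089, the value used for Phase 1 of subsystem A.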


3. Extended Reliability Growth Model for Test-Fix-Find-Test 

The Extended Model utilizes A, BC and BD failure mode classification to analyze 
reliability growth projection data [Ref. 20]. 

Type BD failure modes. Corrective actions for Type BD failure modes are
delayed until the end of the test. They are the same as Type B failure modes in the
test-find-test model.

Type BC failure modes. Corrective actions for Type BC failure modes are
incorporated during the test.

Type A failure modes, as before, are those that will not receive any corrective
actions. Type A and Type BD failure modes do not contribute to reliability growth during
the test. The growth in reliability during the test is affected only by corrective actions for
Type BC failure modes. The objective of this model is to estimate the increase in
reliability due to the corrective actions for Type BD failure modes at the end of the test.

The projected failure intensity after the incorporation of delayed fixes into the 
system for the Extended Model is 

$$\lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T \mid BD) \qquad (4.16)$$

The first term $\lambda_{CA}$ is the failure rate prior to the delayed fixes. It is the same as equation
4.3 applied to all A, BC and BD failure modes. The remaining terms are calculated in the
same manner as in the AMSAA Test-Find-Test model using only data for BD failure
modes.


Finally the projected MTBF after the incorporation of delayed fixes into the 
system for the Extended Model is the inverse of the failure intensity given by 


$$M_{EM} = \frac{1}{\lambda_{EM}} \qquad (4.17)$$


In addition, the AMSAA Extended Test-Fix-Find-Test Model can be modified to
analyze test-fix-test data and test-find-test data by designating failure modes as BC and
BD respectively.

C. RELIABILITY GROWTH ANALYSIS FOR SUBSYSTEM A


Reliability growth for subsystem A is tracked over three phases of testing.
Reliability is tracked on a phase-by-phase basis using test data collected within each test
phase. ReliaSoft's RGA 6 PRO software is used for analyzing the collected data and
generating reliability growth plots [Ref. 21]. The reliability growth model selected
must be based on the management approach employed in each test
phase:

Phase 1: AMSAA Extended Test-Find-Test Projection Model 

Phase 2: AMSAA Extended Model for Test-Fix-Test 

Phase 3: AMSAA Extended Model for Test-Fix-Find-Test 

The AMSAA Basic Model for Test-Fix-Test does not utilize any failure mode
designation but the AMSAA Extended Model does. Specific knowledge of Type BC and
Type BD failure modes can help to generate useful metrics for decision making and
engineering purposes [Ref. 20]. The AMSAA Extended Models are used to analyze both
test-fix-test and test-find-test data by setting all failure modes to BC and BD respectively.
The underlying mathematical principles of the AMSAA Basic Test-Fix-Test Model and
AMSAA Basic Test-Find-Test Model remain unchanged.

1. Phase 1 results and analysis 

In Phase 1, the prototype system was subjected to 280 rounds of testing according
to the test plan. Since this test phase is short, fixes are not incorporated into the system
during the test. Three failures were identified during the test, but all corrective actions
were delayed until the end of the test. This management strategy is known as test-find-test.
The AMSAA Extended Test-Find-Test Projection Model is selected to analyze the
reliability of the system after the incorporation of delayed fixes. All failure modes
identified during the test will receive a delayed corrective action; therefore, all failures are
classified as Type BD. These failures were also assigned a failure category
according to their root cause, as shown in Table 10.


j    Time to Event, X_j (rounds)    Classification    Mode    Failure Category
1    21                             BD                1       Faulty component
2    132                            BD                2       Design
3    215                            BD                3       Design

Table 10. Test-find-test data for Phase 1


BD Mode    Number of Failures, N_j    Time to First Occurrence (rounds)    EF, d_j
1          1                          21                                   0.65
2          1                          132                                  0.7
3          1                          215                                  0.7

Table 11. Test-find-test Type B failure mode data and EF for Phase 1


Table 11 shows the frequency and the assigned effectiveness factor (EF) for each
Type BD failure mode. The EF is an engineering estimate of the probability that
the fix is effective in mitigating or reducing the probability of occurrence of the
particular failure mode. An EF of 1.0 is not practical in most cases since a fix is
unlikely to eliminate a failure mode completely. Studies have shown that an
average effectiveness factor of 0.7 is reasonable for a typical reliability growth program
[Ref. 20]. Failure mode BD1 was assigned a lower EF due to the high uncertainty
associated with the effectiveness of the corrective action.


Since the test data consist of only Type BD failure modes, the achieved system
failure intensity can be estimated by equation 4.8.

$$\lambda_S = \lambda_B = \frac{N_B}{T} = \frac{3}{280} = 0.0107$$

The estimated achieved MRBF at T = 280 rounds before the jump is the inverse of
the achieved system failure intensity.

$$M_S = \frac{1}{\lambda_S} = 93.3 \text{ rounds}$$

Next, the projected failure intensity due to the delayed fixes is calculated using
equation 4.12.

$$\lambda_P = \lambda_A + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T)$$

The average EF of the delayed fixes is given by equation 4.13.

$$\bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.65 + 0.7 + 0.7}{3} = 0.683$$

The term $h(T \mid BD) = \lambda\beta T^{\beta-1}$ from equation 4.14 is a function of $\beta$ and $\lambda$. These
two parameters are estimated from equations 4.5 and 4.6 using first occurrence data from
Table 11.


$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{3}{3\ln 280 - (\ln 21 + \ln 132 + \ln 215)} = 0.8319$$

$$\lambda = \frac{N}{T^{\beta}} = \frac{3}{280^{0.8319}} = 0.0276$$

$$h(280 \mid BD) = 0.0089$$

The metric $h(T \mid BD)$ represents the intensity of Type BD failure modes that
have not been seen during testing, that is, the rate at which new distinct
Type BD modes are occurring at the end of the test.

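With all the above parameters defined, the projected failure intensity can be
calculated.

$$\lambda_P = \lambda_A + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T) = \sum_{j=1}^{3}(1-d_j)\frac{N_j}{280} + 0.683 \times 0.0089 = 0.00952$$

The projected MRBF due to the jump is the inverse of the projected failure
intensity.

$$M_P = \frac{1}{\lambda_P} = 105 \text{ rounds}$$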





For a two-sided confidence level of 90 %, the projected MRBF is between 40 and 
278 rounds. 
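The Phase 1 hand calculation can be reproduced with a short script. This is a sketch under stated assumptions rather than the software's algorithm: the function name `project_test_find_test` is hypothetical, equations 4.5 and 4.6 are taken in their time-terminated form, and no confidence bounds are computed. Carrying unrounded intermediate values gives a projected MRBF of roughly 105.4 rounds, in line with the PMTBF reported by RGA 6 PRO.

```python
import math

def project_test_find_test(first_times, counts, efs, total_time, lambda_a=0.0):
    """Projected intensity and MRBF after delayed fixes (equations 4.12 to 4.15)."""
    m = len(first_times)
    # Shape and scale from first-occurrence times (time-terminated form assumed).
    beta = m / (m * math.log(total_time) - sum(math.log(t) for t in first_times))
    lam = m / total_time ** beta
    h = lam * beta * total_time ** (beta - 1)   # unseen-mode intensity, eq. 4.14
    d_bar = sum(efs) / m                        # average EF, eq. 4.13
    residual = sum((1 - d) * n / total_time for d, n in zip(efs, counts))
    lambda_p = lambda_a + residual + d_bar * h  # eq. 4.12
    return {"beta": beta, "lambda": lam, "h": h, "d_bar": d_bar,
            "lambda_p": lambda_p, "mrbf_p": 1.0 / lambda_p}

# Table 11 data: first-occurrence times, failure counts and EFs per BD mode.
phase1 = project_test_find_test([21, 132, 215], [1, 1, 1], [0.65, 0.7, 0.7], 280)
for name, value in phase1.items():
    print(f"{name}: {value:.4f}")
```

Small differences from the rounded hand calculation are expected because the script does not round between steps.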


Projection Summary
  β:        0.8319
  λ:        0.0276
  PMTBF:    105.4
  DMTBF:    93.33

Statistical Results
                           Result    Test Value    Upper
  Cramer-Von Mises (BD)    Passed    0.059         0.154

Table 12. RGA 6 PRO projection summary and Cramer Von Mises test results for Phase 1


The RGA 6 PRO generated results shown in Table 12 are consistent with the hand
calculated values. The Cramer-Von Mises statistic of 0.059 is below the critical value of
0.154 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is
applicable is accepted.

Figure 8 shows the plot of reliability versus time for subsystem A during the test.
The MRBF is constant ($\beta = 1$) during the test because no fixes were implemented on the
system and thus the system failure rate remains constant during the test. There is a jump
in reliability at the end of the test due to fixes being incorporated into the system. The
projection model estimates that the system MRBF jumps to a value of 105 rounds due to
three distinct corrective actions with the corresponding EF stated in Table 11. The
estimated MRBF of 105 rounds after the incorporation of fixes exceeds the planned
target of 104 rounds at the end of Phase 1.

[Plot: Instantaneous MRBF vs Time]

Figure 8. MRBF projection for Phase 1


In addition, the AMSAA Extended Test-Find-Test Projection Model can also be 
used to estimate the fraction of seen and unseen Type BD failure intensity at the end of 
the test. The intensity for Type BD failure modes that have been seen in the testing can be 
estimated as follows: 


$$\lambda_{BD} - h(280 \mid BD) = 0.0107 - 0.0089 = 0.0018 \qquad (4.18)$$

The fraction of Type BD failure intensity due to failure modes that have been seen
in test is [Ref. 20]:

$$\text{Fraction Seen} = \frac{\lambda_{BD} - h(280 \mid BD)}{\lambda_{BD}} = \frac{0.0018}{0.0107} = 0.168 \qquad (4.19)$$

The fraction of Type BD failure intensity due to failure modes that have not been
seen in test is [Ref. 20]:

$$\text{Fraction Unseen} = 1 - \text{Fraction Seen} = \frac{h(280 \mid BD)}{\lambda_{BD}} = 0.832 \qquad (4.20)$$

Figure 9 displays the failure rate of each Type BD failure mode before and after
implementing the fixes. It gives clear visibility of the contribution of each
individual Type BD failure mode to the system's overall failure rate. After the fixes,
failure mode BD1 has the highest failure rate in Figure 9 because the remaining failure
rate depends directly on the assumed EF. The EF assumed for failure mode BD1 is lower
than that for BD2 and BD3, so BD1 should be the main focus of the failure management
strategy. The ability to designate failure modes provides clearer
management and engineering insight when formulating the failure management strategy.


[Plot: Individual Mode Failure Intensity]

Figure 9. Before and after failure rate for Type BD failure modes in Phase 1 based on
frequency and EF

2. Phase 2 results and analysis


The testing approach used in Phase 2 is Test-Analyze-And-Fix (TAAF), in which
fixes for all failure modes discovered are incorporated during the test. The system
is tested for 820 rounds in this phase. The AMSAA Extended Test-Fix-Test Model
designates all failures as Type BC, as shown in Table 13.


j    Time to Event, X_j (rounds)    Classification    Mode    Failure Category
1    27                             BC                1       Faulty component
2    72                             BC                2       Design
3    122                            BC                2       Design
4    265                            BC                3       Software
5    317                            BC                4       Design
6    394                            BC                5       Design
7    455                            BC                2       Design
8    719                            BC                6       Faulty component

Table 13. Test-fix-test data for Phase 2


BC Mode    Number of Failures, N_j    Time to First Occurrence (rounds)
1          1                          27
2          3                          72
3          2                          265
4          1                          317
5          1                          394
6          1                          719

Table 14. Unique first time occurrence BC failure mode for Phase 2


During Phase 2, six unique Type BC failure modes were observed in 820
rounds of testing. The demonstrated MRBF is calculated next.

The shape parameter is estimated using equation 4.5.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{8}{8\ln 820 - (\ln 27 + \ln 72 + \ln 122 + \ln 265 + \ln 317 + \ln 394 + \ln 455 + \ln 719)} = 0.7089$$

The calculated $\beta$ of 0.7089 ($\beta < 1$) implies positive reliability
growth in this phase.

The relationship between the growth rate and the shape parameter is given by
equation 4.2.

$$\alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.7089 = 0.2911$$

The calculated growth rate of 0.2911 is close to but falls below the desired value
of 0.32. This implies that reliability is not growing as fast as planned.

The scale parameter is estimated using equation 4.6.

$$\lambda = \frac{N}{T^{\beta}} = \frac{8}{820^{0.7089}} = 0.0687$$

The achieved failure intensity is given by equation 4.3.

$$\lambda_{CA} = r(T) = \lambda\beta T^{\beta-1} = 0.0687 \times 0.7089 \times 820^{0.7089-1} = 0.0069$$


The demonstrated instantaneous MRBF at the end of Phase 2 after 820 rounds of
testing is the reciprocal of the intensity function, given by equation 4.4.

$$M_{CA} = \frac{1}{\lambda_{CA}} = 145 \text{ rounds}$$

For a two-sided confidence level of 90 %, the demonstrated MRBF is between 63
and 431 rounds.
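The point estimates for Phase 2 can be checked with a few lines of Python. This is a minimal sketch assuming the time-terminated maximum likelihood forms of equations 4.5 and 4.6; the function name `crow_amsaa_fit` is hypothetical and the confidence bounds quoted above are not reproduced.

```python
import math

def crow_amsaa_fit(times, total_time):
    """Return (beta, lambda, intensity at T, instantaneous MRBF at T)."""
    n = len(times)
    beta = n / (n * math.log(total_time) - sum(math.log(x) for x in times))
    lam = n / total_time ** beta
    intensity = lam * beta * total_time ** (beta - 1)
    return beta, lam, intensity, 1.0 / intensity

# Event times from Table 13 (rounds), test terminated at 820 rounds.
phase2_times = [27, 72, 122, 265, 317, 394, 455, 719]
beta, lam, intensity, mrbf = crow_amsaa_fit(phase2_times, 820)
print(f"beta = {beta:.4f}, growth rate = {1 - beta:.4f}")
print(f"lambda = {lam:.4f}, intensity = {intensity:.4f}, MRBF = {mrbf:.1f}")
```

With unrounded intermediates the scale parameter comes out slightly above the hand value (about 0.0688 rather than 0.0687), which is a rounding effect and does not change the MRBF of about 145 rounds.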


Another useful metric that can be determined from the test data is the initial system
MRBF at the beginning of this phase [Ref. 20].

$$M_I = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)}{\lambda^{1/\beta}} = 54 \text{ rounds} \qquad (4.21)$$

The initial MRBF of 54 rounds at the beginning of Phase 2 falls within the
confidence interval of 40 to 278 rounds at the end of Phase 1. It is estimated that the
system MRBF was 54 rounds at the beginning of the test and that, due to six distinct
fixes, the reliability grew to 145 rounds at the end of 820 rounds of testing.
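The initial MRBF can be evaluated with the gamma function from the Python standard library. The formula used below, the gamma function of (1 + 1/β) divided by λ raised to the power 1/β, is an assumption about the form of equation 4.21; it is adopted here because it reproduces the initial values quoted in this chapter (about 54 rounds for subsystem A and about 156 kilometers for subsystem B). The function name is hypothetical.

```python
import math

def initial_mtbf(lam, beta):
    """Initial MTBF implied by the fitted scale and shape parameters (assumed form)."""
    return math.gamma(1.0 + 1.0 / beta) / lam ** (1.0 / beta)

# Subsystem A, Phase 2 parameters as reported by RGA 6 PRO.
m_initial_a = initial_mtbf(0.0688, 0.7089)
# Subsystem B, Phase 2 hand-calculated parameters.
m_initial_b = initial_mtbf(0.0205, 0.7905)
print(f"subsystem A: {m_initial_a:.1f} rounds")
print(f"subsystem B: {m_initial_b:.1f} kilometers")
```

The result is sensitive to rounding of λ, so agreement with the quoted values to within about one unit is the most that should be expected from rounded inputs.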


Analysis Summary
  Model:               Crow-AMSAA (NHPP)
  Analysis Method:     MLE
  Test Procedure:      Developmental
  Input Type:          Cumulative
  Termination Time:    820
  β:                   0.7089
  λ:                   0.0688
  Growth Rate:         0.2911
  Instant. MTBF:       144.58

Statistical Results
                      Result    Test Value    Upper
  Cramer-Von Mises    Passed    0.035         0.165

Table 15. RGA 6 PRO analysis summary and Cramer Von Mises test results for Phase 2


The RGA 6 PRO generated results presented in Table 15 are consistent with the
hand calculated values. The Cramer-Von Mises statistic of 0.035 is below the critical
value of 0.165 for a significance level of 0.1. Hence the hypothesis that the AMSAA
model is appropriate is accepted.

[Plot: Instantaneous MRBF vs Time (Rounds)]

Figure 10. MRBF projection for Phase 2


Figure 10 indicates that reliability is increasing with time. The effective
application of the TAAF approach in surfacing and fixing failure modes has contributed
to reliability growth in this phase. According to the idealized growth curve, the expected
MRBF at the end of Phase 2 should approach 159 rounds. The demonstrated MRBF of
145 rounds is close to the expected target.

However, one main concern identified in this phase is the relatively high frequency
of mode BC2, as shown in Figure 11. An effective failure management strategy at this
point of the program should focus on fixing failure mode BC2 by allocating more
resources to identify its root cause and improve on current corrective actions.

[Plot: Breakdown of System's Failure Rate, for failure modes BC1 to BC6]

Figure 11. Failure rate for individual BC failure modes after the test


3. Phase 3 results and analysis 


In Phase 3, some fixes are incorporated into the system during the test while
others are delayed until the end of the test. The reasons for the delayed fixes are
1) unavailability of spare parts or tools required for component replacement or repair
and 2) inability to identify the root cause of failure during the test. This type of data is a
combination of test-fix-test and test-find-test, which is known as test-fix-find-test. The
AMSAA Extended Test-Fix-Find-Test Model is used for analyzing the data. Nine
failures were observed in 1200 rounds of testing. The failures that receive a corrective
action during the test are classified as BC while those whose fixes are delayed are classified
as BD. All the failures surfaced in this phase are presented in Table 16.

j    Time to Event, X_j (rounds)    Classification    Mode    Failure Category
1    55                             BD                1       Faulty component
2    101                            BC                1       Design
3    212                            BC                1       Software
4    317                            BC                2       Faulty component
5    379                            BC                3       Software
6    465                            BC                4       Design
7    520                            BD                2       Faulty component
8    579                            BD                3       Quality
9    900                            BC                5       Workmanship

Table 16. Test-fix-find-test data for Phase 3


BD Mode    Number of Failures, N_j    Time to First Occurrence (rounds)    EF, d_j
1          1                          55                                   0.6
2          1                          520                                  0.6
3          1                          579                                  0.6

Table 17. Test-find-test Type BD failure mode data and EF for Phase 3


There are six BC failures in five unique BC failure modes and three unique BD failure
modes in this phase. The EF for all BD failure modes is conservatively assigned as 0.6
because this is the last test phase, with no further testing to verify their effectiveness.
The assigned EF will be used for estimating the jump in the system reliability due to the
delayed fixes.

The failure intensity after 1200 rounds of testing, before
incorporation of the delayed fixes, is estimated using equation 4.3.

$$\lambda_{CA} = r(T) = \lambda\beta T^{\beta-1}$$

The shape parameter $\beta$ is calculated using equation 4.5 based on the data in
Table 16.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{9}{9\ln 1200 - (\ln 55 + \ln 101 + \ln 212 + \ln 317 + \ln 379 + \ln 465 + \ln 520 + \ln 579 + \ln 900)} = 0.715$$

The calculated $\beta$ of 0.715 ($\beta < 1$) implies positive reliability growth in this
phase.

The growth rate is given by equation 4.2.

$$\alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.715 = 0.285$$

The calculated growth rate of 0.285 is consistent with that of Phase 2.

The scale parameter is given by equation 4.6.

$$\lambda = \frac{N}{T^{\beta}} = \frac{9}{1200^{0.715}} = 0.0563$$


The achieved failure intensity before the incorporation of the delayed fixes at a
cumulative time of 1200 rounds is

$$\lambda_{CA} = r(T) = \lambda\beta T^{\beta-1} = 0.0563 \times 0.715 \times 1200^{0.715-1} = 0.00533$$

The achieved MRBF is the inverse of the failure intensity, given by

$$M_{CA} = \lambda_{CA}^{-1} = 186 \text{ rounds}$$

For a two-sided confidence level of 90 %, the demonstrated MRBF is between 85
and 512 rounds.

Analysis Results
  β:        0.715
  λ:        0.0563
  DMTBF:    186.3

Statistical Results
                      Result    Test Value    Upper
  Cramer-Von Mises    Passed    0.0995        0.167

Table 18. RGA 6 PRO failure modes analysis results and Cramer Von Mises statistical test
results for Phase 3

The RGA 6 PRO generated results presented in Table 18 are consistent with the
hand calculated values. The Cramer-Von Mises statistic of 0.0995 is below the critical
value of 0.167 for a significance level of 0.1. Hence the hypothesis that the AMSAA
model is appropriate is accepted.

The demonstrated MRBF of 186 rounds at the end of Phase 3, prior to the
incorporation of the delayed fixes, did not meet the requirement of 200 rounds because the
achieved growth rate of 0.29 is below the planned value of 0.32. However, a trend of a
decreasing number of failures is evident from the cumulative number of failures versus
time plot in Figure 12. Only one failure was observed in the last 600 rounds of testing.
Figure 12 also shows that the results are slightly biased, as the number of failures at each
instant of time is underestimated. The next step is to estimate the jump in reliability as a
result of the delayed fixes.

[Plot: Cumulative Number of Failures vs Time]

Figure 12. Cumulative number of failures vs time plot for Phase 3

[Plot: Instantaneous MRBF vs Time]

Figure 13. MRBF vs time plot for Phase 3


The projected failure intensity after the incorporation of the delayed fixes into the
system is calculated using equation 4.16.

$$\lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T \mid BD)$$

The failure intensity for Type BD failure modes is given by

$$\lambda_{BD} = \frac{\text{Total number of Type BD failures}}{\text{Total test time}} = \frac{3}{1200} = 0.0025$$

The term $h(T \mid BD) = \lambda\beta T^{\beta-1}$ from equation 4.14 is a function of $\beta$ and $\lambda$.
These two parameters are estimated from equations 4.5 and 4.6 using first occurrence
data from Table 17.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{3}{3\ln 1200 - (\ln 55 + \ln 520 + \ln 579)} = 0.645$$

The growth rate due to the three distinct fixes is given by equation 4.2.

$$\alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.645 = 0.355$$

$$\lambda = \frac{N}{T^{\beta}} = \frac{3}{1200^{0.645}} = 0.0309 \text{ failures/round}$$

$$h(T \mid BD) = \lambda\beta T^{\beta-1} = 0.001615$$

The metric $h(T \mid BD)$ represents the intensity of Type BD failure modes that
have not been seen during testing, that is, the rate at which new distinct
Type BD modes are occurring at the end of the test.

The average EF is given by equation 4.13.

$$\bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.6 + 0.6 + 0.6}{3} = 0.6$$

Finally, the projected failure intensity after the incorporation of delayed fixes into
the system for the Extended Model can be determined as

$$\lambda_{EM} = \lambda_{CA} - \lambda_{BD} + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T \mid BD)$$

$$= 0.00533 - 0.0025 + \sum_{j=1}^{3}(1-d_j)\frac{N_j}{1200} + 0.6 \times 0.001615$$

$$= 0.00483$$

The Extended Model projected MRBF after the incorporation of the delayed fixes at the
end of the test is given by equation 4.17.

$$M_{EM} = \frac{1}{\lambda_{EM}} = 206.7 \text{ rounds}$$

For a two-sided confidence level of 90 %, the projected MRBF is between 105
and 404 rounds.
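The Phase 3 projection can be reproduced end to end from the data in Tables 16 and 17. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, equations 4.5 and 4.6 are taken in their time-terminated form, and confidence bounds are not computed. Without intermediate rounding, the achieved intensity comes out near 0.00537 (an MRBF of about 186.3 rounds) and the projected MRBF near 206.8 rounds, in line with the software results in Tables 18 and 19.

```python
import math

def beta_mle(times, total_time):
    """Maximum likelihood shape parameter (time-terminated form assumed)."""
    n = len(times)
    return n / (n * math.log(total_time) - sum(math.log(t) for t in times))

def intensity_at_end(times, total_time):
    """lambda * beta * T^(beta - 1) with lambda = n / T^beta."""
    beta = beta_mle(times, total_time)
    lam = len(times) / total_time ** beta
    return lam * beta * total_time ** (beta - 1)

def extended_projection(all_times, bd_first_times, bd_counts, bd_efs, total_time):
    """Achieved and projected failure intensity for the Extended Model (eq. 4.16)."""
    lambda_ca = intensity_at_end(all_times, total_time)    # all A, BC and BD failures
    lambda_bd = sum(bd_counts) / total_time                # observed BD intensity
    h_bd = intensity_at_end(bd_first_times, total_time)    # unseen BD intensity
    d_bar = sum(bd_efs) / len(bd_efs)
    residual = sum((1 - d) * n / total_time for d, n in zip(bd_efs, bd_counts))
    lambda_em = lambda_ca - lambda_bd + residual + d_bar * h_bd
    return lambda_ca, lambda_em

all_times = [55, 101, 212, 317, 379, 465, 520, 579, 900]
lambda_ca, lambda_em = extended_projection(
    all_times, [55, 520, 579], [1, 1, 1], [0.6, 0.6, 0.6], 1200)
print(f"demonstrated MRBF: {1 / lambda_ca:.1f} rounds")
print(f"projected MRBF:    {1 / lambda_em:.1f} rounds")
```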

A sensitivity analysis on the effect of varying EF on MRBF was carried out. The
results show that the projected MRBF increases to 220 rounds, or by about 7 percent, if
the assumed EF is 0.9. It can therefore be concluded that the resulting MRBF does not
vary significantly between the two extreme values of EF.
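A sensitivity sweep of this kind can be sketched as follows. The function `projected_mrbf` is a hypothetical helper, not the procedure used in the analysis: it applies one EF to all three BD modes, takes the demonstrated MRBF (186.3 rounds) and the BD shape parameter (0.6455) from the RGA 6 PRO tables as fixed inputs, and uses the time-terminated simplification h = Mβ/T. Under these assumptions the projected MRBF at an EF of 0.9 comes out at roughly 219 rounds, close to the 220 rounds quoted above.

```python
def projected_mrbf(ef, lambda_ca=1 / 186.3, n_bd_modes=3,
                   total_time=1200.0, beta_bd=0.6455):
    """Projected MRBF when every BD mode receives the same effectiveness factor."""
    lambda_bd = n_bd_modes / total_time            # one failure per BD mode (Table 17)
    h_bd = n_bd_modes * beta_bd / total_time       # assumed form: h = M * beta / T
    residual = n_bd_modes * (1 - ef) / total_time  # intensity left in the fixed modes
    return 1.0 / (lambda_ca - lambda_bd + residual + ef * h_bd)

for ef in (0.6, 0.7, 0.8, 0.9):
    print(f"EF = {ef:.1f}: projected MRBF = {projected_mrbf(ef):.0f} rounds")
```

The projection is insensitive to EF here because only three of the nine failures are Type BD, so most of the system intensity is unaffected by the delayed fixes.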


Analysis Results
  β:        0.6455
  λ:        0.0309
  PMTBF:    206.7

Statistical Results
                           Result    Test Value    Upper
  Cramer-Von Mises (BD)    Passed    0.0872        0.154

Table 19. RGA 6 PRO BD failure modes analysis results and Cramer Von Mises statistical
test results for Phase 3


The RGA 6 PRO generated results presented in Table 19 are consistent with the
hand calculated values. The Cramer-Von Mises statistic of 0.0872 is below the critical
value of 0.154 for a significance level of 0.1. Hence the hypothesis that the AMSAA
model is appropriate is accepted.

Figure 14. Projected MRBF vs time plot for Phase 3

[Plot: Individual Failure Mode Failure Intensity]

Figure 15. Individual failure rate of Type BC and BD failure modes at the end of Phase 3


The Extended Model estimates that the MRBF grew to 186 rounds as a result of
three corrective actions for BC failure modes during the test. It then jumps to 206.7
rounds as a result of the delayed corrective actions for the Type BD failure modes, even
with conservative EF estimates of 0.6. The system is estimated to meet the reliability
requirements after taking into account the effect of the delayed fixes. To provide additional
insight, Figure 15 shows the individual failure rate contribution of both Type BC and
Type BD failure modes. In comparison, failure mode BC1 has the highest relative
failure rate. On the other hand, the failure rates of BD1, BD2 and BD3 have
decreased significantly after fixing. To substantiate this claim from an engineering
viewpoint, the fixes for these three BD modes involve only basic component
replacement or repair. The assigned EF of 0.6 is a conservative estimate for simple fixes,
and hence it can be concluded that the projected MRBF of 206 rounds is a realistic
estimate. For the reliability of the system to grow further, efforts should be
focused on improving the corrective action for failure mode BC1, even though the last
corrective action has proven to be effective.

The system grows from an initial demonstrated MRBF of 93 rounds to a projected MRBF
of 206 rounds. In conclusion, the system is estimated to meet the reliability requirements
at the end of the RGT. However, the projected MRBF falls between 105 and 404 rounds for a
two-sided confidence level of 90 %.

D. RELIABILITY GROWTH ANALYSIS FOR SUBSYSTEM B


Reliability growth for subsystem B is tracked over three phases of testing.
Reliability is tracked on a phase-by-phase basis. As for subsystem A, data collected during
each phase are analyzed using ReliaSoft's RGA 6 PRO software [Ref. 21]. The reliability
growth models selected for reliability analysis in the three test phases are:

Phase 1: AMSAA Extended Test-Find-Test Projection Model 

Phase 2: AMSAA Extended Test-Fix-Test Model 

Phase 3: AMSAA Extended Test-Fix-Test Model 


1. Phase 1 results and analysis 


In Phase 1, the prototype system was subjected to 1000 kilometers of testing. 
During the test five failures were identified but all corrective actions were delayed till the 
end of the test. This management strategy is known as test-find-test. The AMSAA 
Extended Test-Find-Test Model is selected to analyze the reliability of the system after 
the incorporation of delayed fixes. The failures identified during the test were classified 
into their respective failure category as shown in Table 20. 


j    Time to Event, X_j (km)    Classification    Mode    Failure Category
1    159                        BD                1       Workmanship
2    252                        BD                2       Quality
3    299                        BD                3       Design
4    555                        BD                3       Design
5    967                        BD                4       Quality

Table 20. Test-find-test data for Phase 1

BD Mode    Number of Failures, N_j    Time to First Occurrence (km)    EF, d_j
1          1                          159                              0.7
2          1                          252                              0.7
3          2                          299                              0.6
4          1                          967                              0.7

Table 21. Test-find-test Type B failure mode data and effectiveness factor for Phase 1


There are no Type A failure modes since all failures will receive corrective
actions. There are four unique Type B failure modes. All Type B failure modes are
classified as BD. The EF for each BD failure mode assigned in Table 21 is based on
an engineering assessment of the effectiveness of the corrective action.

The achieved system failure intensity is contributed only by Type B failure
modes and is estimated by equation 4.8:

$$\lambda_S = \lambda_B = \frac{N_B}{T} = \frac{5}{1000} = 0.005$$

The estimated achieved MKBF at T = 1000 kilometers before the jump is

$$M_S = \frac{1}{\lambda_S} = 200 \text{ kilometers}$$

Next, the projected failure intensity is calculated using equation 4.12.

$$\lambda_P = \lambda_A + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T)$$

The average EF of the delayed fixes is given by equation 4.13.

$$\bar{d} = \text{Average EF} = \frac{\sum_{j=1}^{M} d_j}{M} = \frac{0.7 + 0.7 + 0.6 + 0.7}{4} = 0.675$$

The term $h(T \mid BD) = \lambda\beta T^{\beta-1}$ from equation 4.14 is a function of $\beta$ and $\lambda$. These
two parameters are estimated from equations 4.5 and 4.6 using first occurrence data from
Table 21.

$$\beta = \frac{N}{N\ln T - \sum_{i=1}^{N}\ln X_i} = \frac{4}{4\ln 1000 - (\ln 159 + \ln 252 + \ln 299 + \ln 967)} = 0.8973$$

$$\lambda = \frac{N}{T^{\beta}} = \frac{4}{1000^{0.8973}} = 0.00813 \text{ failures/kilometer}$$

$$h(T \mid BD) = \lambda\beta T^{\beta-1} = 0.00358$$

The metric $h(T \mid BD)$ represents the intensity of Type BD failure modes that
have not been seen during testing, that is, the rate at which new distinct
Type BD modes are occurring at the end of the test.

With all the above parameters defined, the projected failure intensity can be
calculated.

$$\lambda_P = \lambda_A + \sum_{j=1}^{M}(1-d_j)\frac{N_j}{T} + \bar{d}\,h(T) = \sum_{j=1}^{4}(1-d_j)\frac{N_j}{1000} + 0.675 \times 0.00358 = 0.00411$$

The projected MKBF due to the jump is the inverse of the projected failure
intensity given by equation 4.15.

$$M_P = \frac{1}{\lambda_P} = 242 \text{ kilometers}$$

For a two-sided confidence level of 90 %, the projected MKBF is between 103 and
829 kilometers.
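To make the summation explicit, the residual term can be expanded with the failure counts and EFs of Table 21. This is only a worked restatement of the arithmetic above using rounded values, with $\lambda_A = 0$ because there are no Type A failure modes:

```latex
\sum_{j=1}^{4}(1-d_j)\frac{N_j}{T}
  = \frac{(0.3)(1) + (0.3)(1) + (0.4)(2) + (0.3)(1)}{1000}
  = \frac{1.7}{1000} = 0.0017

\bar{d}\,h(T \mid BD) = 0.675 \times 0.00358 \approx 0.0024

\lambda_P \approx 0 + 0.0017 + 0.0024 = 0.0041
```

More than half of the projected intensity therefore comes from the unseen-mode term rather than from the four modes that received fixes.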


A sensitivity analysis on the effect of varying EF on MKBF was carried out, and
the results show a 10 percent difference if the EF is varied from 0.6 to 0.9. It can
therefore be concluded that, although MKBF increases with increasing EF,
using the two extreme values of EF does not produce a very significant difference in the
MKBF.

Projection Summary
  β:        0.8973
  λ:        0.00813
  PMTBF:    242.56
  DMTBF:    200

Statistical Results
                           Result    Test Value    Upper
  Cramer-Von Mises (BD)    Passed    0.0919        0.155

Table 22. RGA 6 PRO projection summary and Cramer Von Mises test results for Phase 1


The RGA 6 PRO generated results shown in Table 22 are consistent with the hand
calculated values. The Cramer-Von Mises statistic of 0.0919 is below the critical value
of 0.155 for a significance level of 0.1. Hence the hypothesis that the AMSAA model is
appropriate is accepted. From Table 22, it can be seen that the $\beta$ value of subsystem B is
higher than that of subsystem A, which implies a lower growth rate. This is expected because
subsystem B is an OTS system.

[Plot: Instantaneous MKBF vs Time]

Figure 16. MKBF projection for Phase 1


Figure 16 shows the plot of reliability versus time for subsystem B during the test.
The reliability of subsystem B is constant ($\beta = 1$) during the test because no fixes were
implemented on the system and therefore no growth takes place during the test. There
is a jump in reliability at the end of the test due to fixes being incorporated into the
system. The projection model estimates that the MKBF jumps to 242 kilometers at the
end of Phase 1 due to four distinct corrective actions in redesign, quality process and
workmanship improvement. This projected MKBF of 242 kilometers exceeds the
planned MKBF of 229 kilometers, which indicates that reliability growth is progressing
satisfactorily at the end of Phase 1.

The failure trend in Figure 17 shows that the majority of the failures surfaced
during the early stages of the testing, which is typical of a new system during its initial
run-in. These are infant mortality failures due to poor quality and workmanship of
components.

[Plot: Cumulative number of failures vs Time]

Figure 17. Cumulative number of failures vs time plot for Phase 1


In addition, the fraction of seen and unseen Type BD failure intensity can also be
estimated. The intensity for Type BD failure modes that have been seen in the testing can
be estimated as follows:

$$\lambda_{BD} - h(1000 \mid BD) = 0.005 - 0.00358 = 0.00142$$

The fraction of Type BD failure intensity due to failure modes that have been seen
in test is:

$$\text{Fraction Seen} = \frac{\lambda_{BD} - h(1000 \mid BD)}{\lambda_{BD}} = 0.284$$

$$\text{Fraction Unseen} = 0.716$$

Figure 18 displays the failure rate of each individual Type BD failure mode
before and after implementing the fixes. It gives clear visibility of the failure rate
contribution of each individual Type BD failure mode to the system's overall failure rate.
The failure management strategy should focus on fixing mode BD3 as it appears to have
the highest failure rate in Figure 18.

[Plot: System's Failure Rate Breakdown]

Figure 18. Before and after failure rate for Type BD failure mode in Phase 1


2. Phase 2 results and analysis 


The testing approach used in Phase 2 is Test-Analyze-And-Fix (TAAF), in which
fixes for all failure modes discovered are incorporated during the test. The total
cumulative test mileage for this phase is 1600 kilometers. The AMSAA Extended
Test-Fix-Test Model is used to analyze the reliability for this phase. All failure modes are
classified as Type BC in Table 23.


j    Time to Event, X_j (km)    Classification    Mode    Failure Category
1    89                         BC                1       Design
2    147                        BC                2       Faulty Component
3    356                        BC                3       Design
4    626.84                     BC                4       Quality
5    719                        BC                3       Quality
6    1285.4                     BC                5       Design
7    1420                       BC                6       Design

Table 23. Test-fix-test data for Phase 2


BC Mode    Number of Failures, N_j    Time to First Occurrence (km)
1          1                          89
2          1                          147
3          2                          356
4          1                          626.84
5          1                          1285.4
6          1                          1420

Table 24. Unique first time occurrence BC failure mode for Phase 2


During this phase, seven failures from six unique failure modes were observed in
1600 kilometers of testing. With the above test data, the demonstrated MKBF for this
test phase can be calculated.

The shape parameter is estimated using equation 4.5.

$$\hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i}$$

$$= \frac{7}{7 \ln 1600 - (\ln 89 + \ln 147 + \ln 356 + \ln 626.84 + \ln 719 + \ln 1285.4 + \ln 1420)} = 0.7905$$

The calculated $\beta$ of 0.7905 ($\beta < 1$) is lower than the Phase 1 value, which
implies a reliability improvement compared to the previous phase.
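As a cross-check, the shape parameter can be recomputed from the failure times in Table 23. The sketch below assumes the time-terminated maximum likelihood form of equation 4.5; the function and variable names are illustrative and are not taken from the RGA software.

```python
import math

def crow_amsaa_shape(failure_times, total_time):
    """MLE of the Crow-AMSAA shape parameter for a time-terminated test."""
    n = len(failure_times)
    log_sum = sum(math.log(x) for x in failure_times)
    return n / (n * math.log(total_time) - log_sum)

# Cumulative kilometers at each failure, taken from Table 23.
phase2_km = [89, 147, 356, 626.84, 719, 1285.4, 1420]

beta_hat = crow_amsaa_shape(phase2_km, total_time=1600)
growth_rate = 1.0 - beta_hat  # Duane growth rate, alpha = 1 - beta

print(f"beta = {beta_hat:.4f}, growth rate = {growth_rate:.4f}")
```

The result agrees with the quoted value of 0.7905 to within rounding.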
The relationship between the growth rate and the shape parameter is given by
equation 4.2.

$$\alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.7905 = 0.2095$$

The scale parameter is estimated using equation 4.6.

$$\hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{7}{1600^{0.7905}} = 0.0205$$

The achieved failure intensity at the end of the test is estimated by equation 4.3.

$$\lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} = 0.0205 \times 0.7905 \times 1600^{0.7905 - 1} = 0.00345$$


The demonstrated instantaneous MKBF at the end of Phase 2, after 1600
kilometers of testing, is given by equation 4.4 as the reciprocal of the intensity function:

$$M_{CA} = \frac{1}{\lambda_{CA}} = 289 \text{ kilometers}$$

For a two-sided confidence level of 90 %, the demonstrated MKBF is between
119 and 953 kilometers.
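The chain from shape parameter to demonstrated MKBF can be reproduced as follows. This is a minimal sketch that takes the quoted shape parameter as its input; the variable names are illustrative.

```python
n_failures = 7      # failures observed in Phase 2
total_km = 1600.0   # test length T in kilometers
beta_hat = 0.7905   # shape parameter quoted in the text

# Scale parameter: lambda = N / T**beta
lambda_hat = n_failures / total_km ** beta_hat

# Achieved failure intensity at the end of test: r(T) = lambda * beta * T**(beta - 1)
intensity = lambda_hat * beta_hat * total_km ** (beta_hat - 1.0)

# Demonstrated instantaneous MKBF is the reciprocal of the intensity.
mkbf = 1.0 / intensity

print(f"lambda = {lambda_hat:.4f}, intensity = {intensity:.5f}, MKBF = {mkbf:.0f} km")
```

Because the scale parameter is itself N / T^beta, the intensity simplifies to N * beta / T, which is a quick way to sanity-check the software output.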

Another useful metric is the initial system instantaneous MKBF at the beginning
of this phase. It is given in equation 4.21.

$$M_I = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)}{\lambda^{1/\beta}} = 156 \text{ kilometers} \qquad (4.22)$$

The initial MKBF of 156 kilometers at the beginning of Phase 2 is within the
confidence interval of 110 to 534 kilometers at the end of Phase 1. It is estimated that
the system MKBF was 156 kilometers at the beginning of the test and that, due to six
distinct fixes, the reliability grew to 289 kilometers at the end of 1600 kilometers of
testing.
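The initial MKBF can be evaluated numerically with the gamma function from the Python standard library. The sketch assumes the form M_I = Gamma(1 + 1/beta) / lambda^(1/beta), which reproduces the quoted 156 kilometers; the function name is illustrative.

```python
import math

def initial_mkbf(lambda_hat, beta_hat):
    """Initial MKBF: Gamma(1 + 1/beta) / lambda**(1/beta)."""
    return math.gamma(1.0 + 1.0 / beta_hat) / lambda_hat ** (1.0 / beta_hat)

# Phase 2 estimates quoted in the text.
m_initial = initial_mkbf(lambda_hat=0.0205, beta_hat=0.7905)
m_final = 289.0  # demonstrated MKBF at the end of Phase 2

print(f"initial MKBF = {m_initial:.0f} km, growth factor = {m_final / m_initial:.2f}")
```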
Analysis Summary

Parameter           Value
Model               Crow-AMSAA (NHPP)
Analysis Method     MLE
Test Procedure      Developmental
Input Type          Cumulative
Termination Time    1600
β                   0.7905
λ                   0.0205
Growth Rate         0.2095
Instant. MTBF       289.13

Statistical Results

Test                Result    Test Value   Upper (critical value)
Cramér-von Mises    Passed    0.0275       0.165


Table 25. RGA 6 PRO analysis summary and Cramér-von Mises test results for Phase 2


The results generated by RGA 6 PRO are presented in Table 25. The Cramér-von
Mises statistic of 0.0275 is below the critical value of 0.165 for a significance level of
0.1. Hence the hypothesis that the AMSAA model is appropriate is not rejected.
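The goodness-of-fit statistic can also be reproduced by hand. The sketch below assumes the standard time-terminated form of the Cramér-von Mises statistic from the reliability growth literature, which uses the unbiased shape estimate (M - 1)/M times the MLE; this section does not state the formula, so treat that form as an assumption. The critical value is the one quoted in the text.

```python
import math

def cramer_von_mises_statistic(failure_times, total_time):
    """Cramér-von Mises statistic for a time-terminated Crow-AMSAA fit."""
    m = len(failure_times)
    beta_hat = m / (m * math.log(total_time) - sum(math.log(x) for x in failure_times))
    beta_bar = (m - 1) / m * beta_hat  # unbiased shape estimate used by the test
    stat = 1.0 / (12.0 * m)
    for i, x in enumerate(sorted(failure_times), start=1):
        stat += ((x / total_time) ** beta_bar - (2 * i - 1) / (2.0 * m)) ** 2
    return stat

phase2_km = [89, 147, 356, 626.84, 719, 1285.4, 1420]  # Table 23
c2 = cramer_von_mises_statistic(phase2_km, total_time=1600)

critical_value = 0.165  # quoted in the text for a 0.1 significance level
model_not_rejected = c2 < critical_value

print(f"C^2 = {c2:.4f}, model not rejected: {model_not_rejected}")
```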
Figure 19. Failure intensity vs time plot for Phase 2

Figure 20. MKBF vs time plot for Phase 2


According to the idealized growth curve, the expected MKBF at the end of Phase
2 should be approaching 296 kilometers. The instantaneous failure intensity versus time
plot in Figure 19 shows that the failure intensity does not decrease significantly during
the test. The main reason is the high frequency of occurrence of failure mode BC3, as
displayed in Figure 21. At this point the program manager must focus on correcting this
failure mode.
Figure 21. Failure rate of individual BC failure modes at the end of test


Lastly, it may also be of interest from an engineering perspective to estimate the 
level of effectiveness of the fixes for the six Type BC failure modes. The average 
effectiveness factor can be calculated as follows [Ref 20]: 


$$\bar{d}_{BC} = \frac{\lambda_I - \lambda_{CA}}{\lambda_{I(BC)} - h(T \mid BC)} \qquad (4.23)$$


The initial system failure intensity is the inverse of the initial system MKBF in
equation 4.22.

$$\lambda_I = \frac{1}{M_I} = \frac{1}{156} = 0.00641$$


Since there is no Type A failure mode in the data, the initial failure intensity for
Type BC failure modes is equal to the initial failure intensity of the system.

$\lambda_{CA}$ is the achieved system failure intensity at the end of the test phase, which
has already been determined. The next step is to determine the failure intensity for new
Type BC failure modes at the end of this test phase. This is the same as equation 4.14 but
considers only BC modes.


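$$h(T \mid BC) = \lambda \beta T^{\beta - 1}$$

The estimates of $\lambda$ and $\beta$ are calculated using equations 4.5 and 4.6 based on the
first time occurrence data for BC failure modes in Table 24.

$$\hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{6}{6 \ln 1600 - (\ln 89 + \ln 147 + \ln 434.6 + \ln 626.84 + \ln 1285.4 + \ln 1420)} = 0.763$$

$$\hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{6}{1600^{0.763}} = 0.0214$$

$$h(T \mid BC) = \lambda \beta T^{\beta - 1} = 0.00304$$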

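Finally, the average EF for Type BC failure modes can be determined:

$$\bar{d}_{BC} = \frac{\lambda_I - \lambda_{CA}}{\lambda_{I(BC)} - h(T \mid BC)} = \frac{0.00641 - 0.00345}{0.00641 - 0.00304} = 0.878 \qquad (4.24)$$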


In conclusion, the six corrective actions removed an average of 87.8 % of the
failure rate from the six unique failure modes; an average of 12.2 % remained in the six
BC modes. The average EF of 0.878 is high, which implies that the corrective actions that
have been incorporated are very effective.
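The average effectiveness factor follows directly from the intensities quoted above. The sketch below simply evaluates that ratio; the function and argument names are illustrative, and the inputs are the values stated in the text rather than recomputed quantities.

```python
def average_effectiveness_factor(lambda_initial, lambda_achieved,
                                 lambda_bc_initial, h_new_bc):
    """Average EF: removed intensity divided by the intensity of the seen BC modes."""
    return (lambda_initial - lambda_achieved) / (lambda_bc_initial - h_new_bc)

lambda_initial = 1.0 / 156.0   # inverse of the initial MKBF of 156 km
lambda_achieved = 0.00345      # achieved intensity at the end of Phase 2
h_new_bc = 0.00304             # quoted intensity of new (unseen) BC modes

# No Type A modes are present, so the initial BC intensity equals the system value.
ef = average_effectiveness_factor(lambda_initial, lambda_achieved,
                                  lambda_initial, h_new_bc)

print(f"average EF = {ef:.3f}, remaining fraction = {1.0 - ef:.3f}")
```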
3. Phase 3 results and analysis


The TAAF approach should be applied in the last test phase so that the
effectiveness of all corrective actions implemented during the test can be verified by the
end of the test. Hence a more accurate assessment of the system reliability based on the
current configuration can be obtained. In Phase 3, all fixes are incorporated during the
test. The system was planned to be tested for 2250 kilometers in this phase. During this
phase the MKBF of the system was continuously assessed to provide constant technical
and management visibility on the effectiveness of corrective actions and program status.

Testing was terminated early at 2200 kilometers instead of the planned
2250 kilometers because the system reliability had exceeded the requirement. The data
collected during the test are presented in Table 26.


j   Time to Event, X_j (km)   Classification   Mode   Failure Category
1   36                        BC               1      Design
2   334                       BC               2      Quality
3   823.6                     BC               3      Faulty component
4   958                       BC               4      Workmanship
5   960                       BC               5      Faulty component
6   1433                      BC               6      Quality
7   1741                      BC               7      Workmanship


Table 26. Test-fix-test data for Phase 3


The AMSAA Extended Test-Fix-Test Model is used to analyze the test data in 
Table 26. 


The shape parameter is estimated using equation 4.5.

$$\hat{\beta} = \frac{N}{N \ln T - \sum_{i=1}^{N} \ln X_i} = \frac{7}{7 \ln 2200 - (\ln 36 + \ln 334 + \ln 823.6 + \ln 958 + \ln 960 + \ln 1433 + \ln 1741)} = 0.7524$$

The scale parameter is estimated using equation 4.6.

$$\hat{\lambda} = \frac{N}{T^{\hat{\beta}}} = \frac{7}{2200^{0.7524}} = 0.0214$$

The relationship between the growth rate and the shape parameter is given by
equation 4.2.

$$\alpha_{DUANE} = 1 - \beta_{AMSAA} = 1 - 0.7524 = 0.2476 \approx 0.25$$

The growth rate of 0.25 indicates a significant improvement compared to Phase 2.
The achieved failure intensity is estimated as:

$$\lambda_{CA} = r(T) = \lambda \beta T^{\beta - 1} = 0.0214 \times 0.7524 \times 2200^{0.7524 - 1} = 0.00239$$

The demonstrated instantaneous MKBF at the end of the test phase, after 2200
kilometers of testing, is given by equation 4.4 as the reciprocal of the intensity function:

$$M_{CA} = \frac{1}{\lambda_{CA}} = 417 \text{ kilometers}$$

For a two-sided confidence level of 90 %, the demonstrated MKBF is between 172 and
1377 kilometers.


Another useful metric is the initial system MKBF at the beginning of this phase. It
is given by equation 4.21.

$$M_I = \frac{\Gamma\left(1 + \frac{1}{\beta}\right)}{\lambda^{1/\beta}} = 197 \text{ kilometers}$$

The initial MKBF of 197 kilometers at the beginning of Phase 3 is within the
confidence interval of 103 to 829 kilometers at the end of Phase 2. It is estimated that
the system MKBF was 197 kilometers at the beginning of the test and that, due to seven
distinct fixes, the reliability grew to 417 kilometers at the end of 2200 kilometers of testing.
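The Phase 3 figures can be reproduced end to end from the failure times in Table 26. The sketch below bundles the time-terminated estimates into one hypothetical helper, `analyze_phase`; it is not part of the RGA software and is shown only as a cross-check on the hand calculation.

```python
import math
from typing import NamedTuple, Sequence

class GrowthEstimate(NamedTuple):
    beta: float          # shape parameter
    lam: float           # scale parameter
    intensity: float     # achieved failure intensity at end of test
    mkbf: float          # demonstrated instantaneous MKBF
    initial_mkbf: float  # MKBF at the start of the phase

def analyze_phase(failure_km: Sequence[float], total_km: float) -> GrowthEstimate:
    """Crow-AMSAA point estimates for a time-terminated test phase."""
    n = len(failure_km)
    beta = n / (n * math.log(total_km) - sum(math.log(x) for x in failure_km))
    lam = n / total_km ** beta
    intensity = lam * beta * total_km ** (beta - 1.0)
    initial = math.gamma(1.0 + 1.0 / beta) / lam ** (1.0 / beta)
    return GrowthEstimate(beta, lam, intensity, 1.0 / intensity, initial)

# Cumulative kilometers at each failure, taken from Table 26.
phase3 = analyze_phase([36, 334, 823.6, 958, 960, 1433, 1741], total_km=2200)

print(f"beta = {phase3.beta:.4f}, lambda = {phase3.lam:.4f}")
print(f"MKBF = {phase3.mkbf:.1f} km, initial MKBF = {phase3.initial_mkbf:.0f} km")
```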
Analysis Summary

Parameter           Value
Model               Crow-AMSAA (NHPP)
Analysis Method     MLE
Test Procedure      Developmental
Input Type          Cumulative
Termination Time    2200
β                   0.7524
λ                   0.0214
Growth Rate         0.2476
Instant. MTBF       417.71

Statistical Results

Test                Result    Test Value   Upper (critical value)
Cramér-von Mises    Passed    0.0647       0.165


Table 27. RGA 6 PRO analysis summary and Cramér-von Mises test results for Phase 3


The summary and statistical results generated by RGA 6 PRO are presented in Table
27. The Cramér-von Mises statistic of 0.0647 is below the critical value of 0.165 for a
significance level of 0.1. Hence the hypothesis that the AMSAA model is applicable is
not rejected.

The AMSAA Test-Fix-Test Model estimates a demonstrated MKBF of 417
kilometers at the end of 2200 kilometers of testing, which exceeds the requirement of 350
kilometers. The test was terminated at a cumulative mileage of 2200 kilometers primarily
for two reasons: 1) the cumulative number of failures versus time plot in Figure 22 shows
a decreasing trend in the number of observed failures towards the end of the test, and
2) the MKBF requirement had already been exceeded. Only two failures were observed in
the last 1100 kilometers of testing. The achieved growth rate of 0.25 is also a significant
improvement compared to Phase 2. The effectiveness of the fixes in previous phases has
been validated, which is a likely reason for the improved reliability. Also, the $\beta$ value
of subsystem B is higher than that of subsystem A in all three phases of testing, which
implies a lower growth rate. This is expected because subsystem B is an OTS system.

In conclusion, the implementation of the TAAF approach for subsystem B has
been successful in surfacing and fixing potential failure modes, and the system thus
exceeds the reliability target. The system grew from an initial demonstrated MKBF of 200
kilometers to 417 kilometers. The RGT program identified, reviewed, and fixed a total
of four unique design failure modes and six quality process and control failure modes
during the three phases of testing, leading to positive reliability growth.


Figure 22. Cumulative number of failures vs time plot for Phase 3

[Plot: Failure Intensity vs Time; Crow Extended, Data 1, Developmental, MLE; x-axis: Time (Kilometers), 0 to 3000; y-axis: Failure Intensity, 0 to 0.02; generated by Kim Er, NPS, 12/8/2005 21:34]

Figure 23. Instantaneous failure intensity vs time plot for Phase 3

Figure 24. MKBF vs time plot for Phase 3
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V. CONCLUSIONS 


The key findings of the literature research suggest that the disparity between 
predicted and field reliability stems from some inherent assumptions of the MIL-HDBK- 
217 prediction model. First, the constant failure rate assumption that has been generally 
applied in reliability prediction is not always applicable. However, Drenick’s theorem has 
proven that complex repairable systems, under certain constraints, can be well 
represented by the exponential distribution. The reliability engineer must be able to 
recognize when the mathematical simplicity of the constant failure rate model can be 
used without a substantial penalty in prediction accuracy. Secondly, even if the 
exponential distribution is applicable, the disparity between predicted and field reliability 
may still exist in new systems because of unexpected failure modes that may arise in the 
presence of design and quality deficiencies which will prevent the system from reaching 
the predicted value. One possible solution to reduce the frequency of occurrence of 
unexpected failures in the field and for the system’s reliability to approach the predicted 
value is to apply RGT during the development phase. 

The results of the reliability analysis for the combat system show that the 
demonstrated system reliability for both subsystems is initially low. For subsystem A the 
initial MRBF is only 45 % of the final achieved MRBF. For subsystem B the initial 
MKBF is only 48 % of the final achieved MKBF. However, the reliability for both 
subsystems improves as testing progresses. Reliability is finally estimated to meet the 
predicted value as failure modes are discovered and eliminated through the TAAF 
process. I conclude that the application of RGT during the developmental phase is 
effective in minimizing the disparity between predicted and field reliability. Systems that 
bypass development testing will experience low reliability in the field, which is one of 
the main causes of disparity between predicted and field reliability. 

This thesis has successfully demonstrated the detailed application of the Duane 
Model and the AMSAA Extended Models for the reliability planning and analysis of a 
combat system. Some of the important lessons learned on the use of the reliability growth 
models are summarized below. 
In reliability growth planning, the total test time required for the RGT program is
sensitive to the following parameters: 1) the system's initial reliability, 2) the initial test
time, and 3) the growth rate. In most practical cases, the total test time is fixed by the time
and resources available in the development program. The most accurate way of determining
the system's initial reliability is to subject the system to pre-development testing, and the
initial test time must be long enough for at least the first failure to surface.
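The sensitivity described above can be illustrated with a short calculation. The sketch assumes the idealized Duane growth curve, M(t) = M_I (t / t_I)^alpha / (1 - alpha) for t greater than t_I, as the planning model. The planning inputs are hypothetical and are not the values used in this thesis; only the 350 kilometer target echoes the subsystem B requirement.

```python
def duane_instantaneous_mtbf(t, m_initial, t_initial, alpha):
    """Idealized growth curve: M(t) = M_I * (t / t_I)**alpha / (1 - alpha), t > t_I."""
    return m_initial * (t / t_initial) ** alpha / (1.0 - alpha)

def total_test_time(m_target, m_initial, t_initial, alpha):
    """Invert the idealized curve for the test time at which M(t) reaches m_target."""
    return t_initial * (m_target * (1.0 - alpha) / m_initial) ** (1.0 / alpha)

# Hypothetical planning inputs, chosen only to illustrate the sensitivity.
baseline = total_test_time(m_target=350, m_initial=100, t_initial=100, alpha=0.3)
higher_growth = total_test_time(m_target=350, m_initial=100, t_initial=100, alpha=0.4)
higher_initial = total_test_time(m_target=350, m_initial=150, t_initial=100, alpha=0.3)

print(f"baseline:            {baseline:8.0f}")
print(f"growth rate 0.4:     {higher_growth:8.0f}")
print(f"initial MKBF of 150: {higher_initial:8.0f}")
```

Under these assumed inputs, a modest increase in either the growth rate or the initial reliability shortens the required test time considerably, which is why both should be estimated with care before the test length is fixed.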

The use of failure mode designations such as Type BD and BC associated with the
AMSAA Extended Growth Models has provided better visibility than the AMSAA Basic
Test-Fix-Test Model. It allows generation of many useful metrics such as: 1) initial
system reliability at the beginning of the test, 2) the average effectiveness factor of
remedying failure modes, 3) the fraction of seen and unseen Type BD failure modes, and 4)
the system failure rate breakdown for individual failure modes. Knowing the failure rate
breakdown of individual failure modes in the system is important as it enables easy
identification of failure modes with a relatively high failure rate. The reliability of
subsystems A and B continued to grow during the RGT because of focused efforts in
eliminating these major contributors. In addition, the ability of the AMSAA Extended
Test-Find-Test Model to estimate the increase in the system's reliability for Type BD
failure modes has allowed for a more in-depth analysis of the test data. This is especially
useful at the end of the RGT program when the demonstrated reliability of the system is
below the target and, due to resource constraints, further testing is not possible. It is
therefore important to know if the final system reliability can meet the requirements after
incorporating the delayed fixes. For subsystem A, the Extended Test-Find-Test Model
estimates an increase in MRBF from 186 rounds to 206 rounds due to three distinct
delayed fixes with an assumed EF of 0.6, thus exceeding the MRBF target of 200
rounds. It is also important to note that the final system reliability is sensitive to the
assigned value of EF. To prevent overestimation of the system's final reliability, a
conservative EF should be assigned since the actual effectiveness of the delayed fixes
cannot be determined without further testing.

For new systems under development, the use of the AMSAA NHPP model
provides a better representation of the system's failure rate than the exponential
distribution because the failure rate varies with time as testing progresses. Once the
system matures through a period of testing and reliability growth has reached a plateau,
the system's failure rate will tend towards being well represented by an exponential
distribution.

The various reliability charts, such as cumulative failures versus time,
failure intensity versus time, and MRBF/MKBF versus time, have also given management
and technical staff visibility of how the system is performing during the test, which is a
major factor contributing to the success of the RGT program.
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